[10430010] |
Japanese language
[10430020] |{{Nihongo|'''Japanese'''|日本語 / にほんご |3=}} is a language spoken by over 130 million people in [[Japan]] and in Japanese emigrant communities. [10430030] |It is related to the [[Ryukyuan languages]], but whatever [[Classification of the Japanese language|relationships with other languages]] it may have remain undemonstrated. [10430040] |It is an [[agglutinative language]] and is distinguished by a complex system of [[Honorific speech in Japanese|honorifics]] reflecting the hierarchical nature of Japanese society, with verb forms and particular vocabulary to indicate the relative status of the speaker, the listener and any third person mentioned in conversation, whether or not that person is present. [10430050] |The sound inventory of Japanese is relatively small, and it has a lexically distinct [[Japanese pitch accent|pitch-accent]] system. [10430060] |It is a [[mora (linguistics)|mora]]-timed language. [10430070] |The Japanese language is written with a combination of three different types of scripts: [[Chinese characters]] called ''[[kanji]]'' (漢字 / かんじ), and two [[syllabary|syllabic]] scripts made up of modified [[Chinese characters]], ''[[hiragana]]'' (平仮名 / ひらがな) and ''[[katakana]]'' (片仮名 / カタカナ). [10430080] |The [[Latin alphabet]], ''[[rōmaji]]'' (ローマ字), is also often used in modern Japanese, especially for company names and logos, advertising, and when entering Japanese text into a computer. [10430090] |Western-style [[Arabic numerals]] are generally used for numbers, but traditional [[Sino-Japanese vocabulary|Sino-Japanese]] numerals are also commonplace. [10430100] |Japanese [[vocabulary]] has been heavily influenced by [[loanword]]s from other languages. [10430110] |A vast number of words were borrowed from [[Chinese language|Chinese]], or created from Chinese models, over a period of at least 1,500 years. 
[10430120] |Since the late 19th century, Japanese has borrowed a considerable number of words from [[Indo-European languages]], primarily [[English language|English]]. [10430130] |Because of the special trade relationship between Japan and first [[Portugal]] in the 16th century, and then mainly the [[Netherlands]] in the 17th century, [[Portuguese language|Portuguese]], [[German language|German]] and [[Dutch language|Dutch]] have also been influential. [10430140] |== Geographic distribution == [10430150] |Although Japanese is spoken almost exclusively in Japan, it has been and sometimes still is spoken elsewhere. [10430160] |When [[Imperial Japan|Japan]] occupied [[Korea]], [[Taiwan]], parts of the [[Chinese mainland]], and various Pacific islands before and during [[World War II]], locals in [[Greater East Asia Co-Prosperity Sphere|those countries]] were forced to learn Japanese in empire-building programs. [10430170] |As a result, there are many people in these countries who can speak Japanese in addition to the local languages. [10430180] |Japanese emigrant communities (the largest of which are to be found in [[Brazil]]) sometimes employ Japanese as their primary language. [10430190] |Approximately 5% of Hawaii residents speak Japanese, with Japanese ancestry the largest single ancestry in the state (over 24% of the population). [10430200] |Japanese emigrants can also be found in [[Peru]], [[Argentina]], [[Australia]] (especially [[Sydney]], [[Brisbane]], and [[Melbourne]]), the [[United States]] (notably [[California]], where 1.2% of the population has Japanese ancestry, and [[Hawaii]]), and the [[Philippines]] (particularly in [[Davao]] and [[Laguna (province)|Laguna]]). [10430210] |Their descendants, who are known as {{transl|ja|''[[nikkei]]''}} ({{lang|ja|日系}}, literally Japanese descendants), however, rarely speak Japanese fluently after the second generation. [10430220] |There are estimated to be several million non-Japanese studying the language as well. 
[10430230] |=== Official status === [10430240] |Japanese is the de facto official language of Japan. [10430250] |There is a form of the language considered standard: {{nihongo|''hyōjungo''|標準語|}} ("Standard Japanese"), or {{nihongo|''kyōtsūgo''|共通語|}} ("the common language"). [10430260] |The meanings of the two terms are almost the same. [10430270] |{{transl|ja|''Hyōjungo''}} or {{transl|ja|''kyōtsūgo''}} is a concept defined in contrast to dialect. [10430280] |This normative language emerged after the {{nihongo|[[Meiji Restoration]]|明治維新|meiji ishin|1868}} from the language spoken in uptown [[Tokyo]], out of the need for a common means of communication. [10430290] |{{transl|ja|''Hyōjungo''}} is taught in schools and used on television and in official communications, and is the version of Japanese discussed in this article. [10430300] |Formerly, standard {{nihongo|Japanese in writing|文語|[[Bungo (Japanese language)|bungo]]|"literary language"}} was different from {{nihongo|colloquial language|口語|[[Kogo (Japanese language)|kōgo]]}}. [10430310] |The two systems have different rules of grammar and some variance in vocabulary. [10430320] |{{transl|ja|''Bungo''}} was the main method of writing Japanese until about 1900; since then {{transl|ja|''kōgo''}} gradually extended its influence and the two methods were both used in writing until the 1940s. [10430330] |{{transl|ja|''Bungo''}} still has some relevance for historians, literary scholars, and lawyers (many Japanese laws that survived [[World War II]] are still written in {{transl|ja|''bungo''}}, although there are ongoing efforts to modernize their language). [10430340] |{{transl|ja|''Kōgo''}} is the predominant method of both speaking and writing Japanese today, although {{transl|ja|''bungo''}} grammar and vocabulary are occasionally used in modern Japanese for effect. [10430350] |=== Dialects === [10430360] |Dozens of dialects are spoken in Japan. 
[10430370] |The profusion is due to many factors, including the length of time the [[Japanese Archipelago|archipelago]] has been inhabited, its mountainous island terrain, and Japan's long history of both external and internal isolation. [10430380] |Dialects typically differ in terms of [[Japanese pitch accent|pitch accent]], inflectional [[morphology (linguistics)|morphology]], [[vocabulary]], and particle usage. [10430390] |Some even differ in [[vowel]] and [[consonant]] inventories, although this is uncommon. [10430400] |The main distinction in Japanese accents is between {{nihongo|Tokyo-type|東京式|Tōkyō-shiki}} and {{nihongo|Kyoto-Osaka-type|京阪式|Keihan-shiki}}, though Kyūshū-type dialects form a third, smaller group. [10430410] |Within each type are several subdivisions. [10430420] |Kyoto-Osaka-type dialects are in the central region, with borders roughly formed by [[Toyama Prefecture|Toyama]], [[Kyoto Prefecture|Kyōto]], [[Hyōgo Prefecture|Hyōgo]], and [[Mie Prefecture|Mie]] Prefectures; most [[Shikoku]] dialects are also of that type. [10430430] |The final category of dialects consists of those descended from the Eastern dialect of [[Old Japanese]]; these dialects are spoken on [[Hachijōjima|Hachijō-jima]] and a few other islands. [10430440] |Dialects from peripheral regions, such as [[Tōhoku Region|Tōhoku]] or [[Tsushima Island|Tsushima]], may be unintelligible to speakers from other parts of the country. [10430450] |The several dialects of [[Kagoshima Prefecture|Kagoshima]] in southern [[Kyūshū]] are famous for being unintelligible not only to speakers of standard Japanese but to speakers of nearby dialects elsewhere in Kyūshū as well. [10430460] |This is probably due in part to the Kagoshima dialects' peculiarities of pronunciation, which include the existence of closed syllables (i.e., syllables that end in a consonant, such as {{IPA|/kob/}} or {{IPA|/koʔ/}} for Standard Japanese {{IPA|/kumo/}} "spider"). 
[10430470] |The [[Kansai region|Kansai]] dialect group is spoken and understood by many Japanese, and the [[Osaka]] dialect in particular is associated with comedy (see [[Kansai dialect]]). [10430480] |Dialects of Tōhoku and North [[Kantō region|Kantō]] are stereotypically associated with farmers. [10430490] |The [[Ryūkyūan languages]], spoken in [[Okinawa Prefecture|Okinawa]] and in the [[Amami Islands]], which are politically part of [[Kagoshima Prefecture|Kagoshima]], are distinct enough to be considered a separate branch of the [[Japonic languages|Japonic]] family. [10430500] |However, many ordinary Japanese people tend to regard the Ryūkyūan languages as dialects of Japanese. [10430510] |Not only is each language unintelligible to Japanese speakers, but most are unintelligible to those who speak other Ryūkyūan languages. [10430520] |Recently, Standard Japanese has become prevalent nationwide (including the Ryūkyū islands) due to [[education]], [[mass media]], and increased mobility within Japan, as well as economic integration. [10430530] |== Sounds == [10430540] |{{IPA notice}} [10430550] |Japanese vowels are "pure" sounds. [10430560] |The only unusual vowel is the high back vowel {{IPA|/ɯ/}}, which is like {{IPA|/u/}}, but [[roundedness|compressed]] instead of rounded. [10430570] |Japanese has five vowels, and [[vowel length]] is phonemic, so each one has both a short and a long version. [10430580] |Some Japanese consonants have several [[allophone]]s, which may give the impression of a larger inventory of sounds. [10430590] |However, some of these allophones have since become phonemic. 
[10430600] |For example, in the Japanese language up to and including the first half of the twentieth century, the phonemic sequence {{IPA|/ti/}} was [[palatalization|palatalized]] and realized phonetically as {{IPA|[tɕi]}}, approximately ''chi''; however, now {{IPA|/ti/}} and {{IPA|/tɕi/}} are distinct, as evidenced by words like ''tī'' {{IPA|[tiː]}} "Western style tea" and ''chii'' {{IPA|[tɕii]}} "social status." [10430610] |The 'r' of the Japanese language (technically a [[lateral consonant|lateral]] [[apical consonant|apical]] postalveolar flap) is of particular interest, sounding to most English speakers like something between an 'l' and a [[retroflex consonant|retroflex]] 'r', depending on its position in a word. [10430620] |The syllabic structure and the [[phonotactics]] are very simple: the only [[consonant cluster]]s allowed within a syllable consist of one of a subset of the consonants plus {{IPA|/j/}}. [10430630] |These types of clusters occur only in onsets. [10430640] |However, consonant clusters across syllables are allowed as long as the two consonants are a nasal followed by a [[homo-organic]] consonant. [10430650] |[[Consonant length]] (gemination) is also phonemic. [10430660] |== Grammar == [10430670] |=== Sentence structure === [10430680] |Japanese word order is classified as [[Subject Object Verb]]. [10430690] |However, unlike in many [[Indo-European language]]s, the only strict requirement for intelligibility in a Japanese sentence is that the verb come last. [10430700] |This is because the Japanese [[sentence element]]s are marked with [[Japanese particles|particles]] that identify their grammatical functions. [10430710] |The basic sentence structure is [[topic-comment]]. [10430720] |For example, {{transl|ja|''Kochira-wa Tanaka-san desu''}} ({{lang|ja|こちらは田中さんです}}). [10430730] |{{transl|ja|''Kochira''}} ("this") is the topic of the sentence, indicated by the particle ''-wa''. 
[10430740] |The verb is {{transl|ja|''desu''}}, a [[copula]], commonly translated as "to be" or "it is" (though there are other verbs that can be translated as "to be"). [10430750] |As a phrase, {{transl|ja|''Tanaka-san desu''}} is the comment. [10430760] |This sentence loosely translates to "As for this person, (it) is Mr./Mrs./Miss Tanaka." [10430770] |Thus Japanese, like [[Chinese language|Chinese]], [[Korean language|Korean]], and many other Asian languages, is often called a [[topic-prominent language]], which means it has a strong tendency to indicate the topic separately from the subject, and the two do not always coincide. [10430780] |The sentence {{transl|ja|''Zō-wa hana-ga nagai (desu)''}} ({{lang|ja|象は鼻が長いです}}) literally means, "As for elephants, (their) noses are long". [10430790] |The topic is {{transl|ja|''zō''}} "elephant", and the subject is {{transl|ja|''hana''}} "nose". [10430800] |Japanese is a [[pro-drop language]], meaning that the subject or object of a sentence need not be stated if it is obvious from context. [10430810] |In addition, it is commonly felt, particularly in spoken Japanese, that the shorter a sentence is, the better. [10430820] |As a result of this grammatical permissiveness and tendency towards brevity, Japanese speakers tend naturally to omit words from sentences, rather than refer to them with [[pronoun]]s. [10430830] |In the context of the above example, {{transl|ja|''hana-ga nagai''}} would mean "[their] noses are long," while {{transl|ja|''nagai''}} by itself would mean "[they] are long." [10430840] |A single verb can be a complete sentence: {{transl|ja|''Yatta!''}} [10430850] |"[I / we / they / etc] did [it]!". [10430860] |In addition, since adjectives can form the predicate in a Japanese sentence (below), a single adjective can be a complete sentence: {{transl|ja|''Urayamashii!''}} [10430870] |"[I'm] jealous [of it]!". 
[10430880] |While the language has some words that are typically translated as pronouns, these are not used as frequently as pronouns in some [[Indo-European language]]s, and function differently. [10430890] |Instead, Japanese typically relies on special verb forms and auxiliary verbs to indicate the direction of benefit of an action: "down" to indicate the out-group gives a benefit to the in-group; and "up" to indicate the in-group gives a benefit to the out-group. [10430900] |Here, the in-group includes the speaker and the out-group doesn't, and their boundary depends on context. [10430910] |For example, {{transl|ja|''oshiete moratta''}} (literally, "explained" with a benefit from the out-group to the in-group) means "[he/she/they] explained it to [me/us]". [10430920] |Similarly, {{transl|ja|''oshiete ageta''}} (literally, "explained" with a benefit from the in-group to the out-group) means "[I/we] explained [it] to [him/her/them]". [10430930] |Such beneficiary auxiliary verbs thus serve a function comparable to that of pronouns and prepositions in Indo-European languages to indicate the actor and the recipient of an action. [10430940] |Japanese "pronouns" also function differently from most modern Indo-European pronouns (and more like nouns) in that they can take modifiers as any other noun may. [10430950] |For instance, one cannot say in English: [10430960] |: [10430970] |*The amazed he ran down the street. (grammatically incorrect) [10430980] |But one ''can'' grammatically say essentially the same thing in Japanese: [10430990] |: {{transl|ja|''Odoroita kare-wa michi-o hashitte itta.''}} (grammatically correct) [10431000] |This is partly due to the fact that these words evolved from regular nouns, such as {{transl|ja|''kimi''}} "you" ({{lang|ja|君}} "lord"), {{transl|ja|''anata''}} "you" ({{lang|ja|あなた}} "that side, yonder"), and {{transl|ja|''boku''}} "I" ({{lang|ja|僕}} "servant"). 
[10431010] |This is why some linguists do not classify Japanese "pronouns" as pronouns, but rather as referential nouns. [10431020] |Japanese personal pronouns are generally used only in situations requiring special emphasis as to who is doing what to whom. [10431030] |The choice of words used as pronouns is correlated with the sex of the speaker and the social situation in which they are spoken: men and women alike in a formal situation generally refer to themselves as {{transl|ja|''watashi''}} ({{lang|ja|私}} "private") or {{transl|ja|''watakushi''}} (also {{lang|ja|私}}), while men in rougher or intimate conversation are much more likely to use the word {{transl|ja|''ore''}} ({{lang|ja|俺}} "oneself", "myself") or {{transl|ja|''boku''}}. [10431040] |Similarly, different words such as {{transl|ja|''anata''}}, {{transl|ja|''kimi''}}, and {{transl|ja|''omae''}} ({{lang|ja|お前}}, more formally {{lang|ja|御前}} "the one before me") may be used to refer to a listener depending on the listener's relative social position and the degree of familiarity between the speaker and the listener. [10431050] |When used in different social relationships, the same word may have positive (intimate or respectful) or negative (distant or disrespectful) connotations. [10431060] |Japanese speakers often use the title of the person referred to where pronouns would be used in English. [10431070] |For example, when speaking to one's teacher, it is appropriate to use {{transl|ja|''sensei''}} ({{lang|ja|先生}}, teacher), but inappropriate to use {{transl|ja|''anata''}}. [10431080] |This is because {{transl|ja|''anata''}} is used to refer to people of equal or lower status, and one's teacher is regarded as having higher status. [10431090] |For English-speaking learners of Japanese, a frequent beginner's mistake is to include {{transl|ja|''watashi-wa''}} or {{transl|ja|''anata-wa''}} at the beginning of sentences as one would with ''I'' or ''you'' in English. 
[10431100] |Though these sentences are not grammatically incorrect, even in formal settings they would be considered unnatural, much like repeatedly using a noun in English where a [[pronoun]] would suffice. [10431110] |=== Inflection and conjugation === [10431120] |Japanese nouns have no grammatical number or gender, and take no articles. [10431130] |The noun {{transl|ja|''hon''}} ({{lang|ja|本}}) may refer to a single book or several books; {{transl|ja|''hito''}} ({{lang|ja|人}}) can mean "person" or "people"; and {{transl|ja|''ki''}} ({{lang|ja|木}}) can be "tree" or "trees". [10431140] |Where number is important, it can be indicated by providing a quantity (often with a [[Japanese counter word|counter word]]) or (rarely) by adding a suffix. [10431150] |Words for people are usually understood as singular. [10431160] |Thus {{transl|ja|''Tanaka-san''}} usually means ''Mr./Mrs./Miss Tanaka''. [10431170] |Words that refer to people and animals can be made to indicate a group of individuals through the addition of a collective suffix (a noun suffix that indicates a group), such as {{transl|ja|''-tachi''}}, but this is not a true plural: the meaning is closer to the English phrase "and company". [10431180] |A group described as {{transl|ja|''Tanaka-san-tachi''}} may include people not named Tanaka. [10431190] |Some Japanese nouns are effectively plural, such as {{transl|ja|''hitobito''}} "people" and {{transl|ja|''wareware''}} "we/us", while the word {{transl|ja|''tomodachi''}} "friend" is considered singular, although plural in form. [10431200] |Verbs are [[Japanese verb conjugations|conjugated]] to show tenses, of which there are two: past and non-past (also called present), the latter being used for both the present and the future. [10431210] |For verbs that represent an ongoing process, the {{transl|ja|''-te iru''}} form indicates a continuous (or progressive) aspect. [10431220] |For others that represent a change of state, the {{transl|ja|''-te iru''}} form indicates a perfect aspect. 
[10431230] |For example, {{transl|ja|''kite iru''}} means "He has come (and is still here)", but {{transl|ja|''tabete iru''}} means "He is eating". [10431240] |Questions (both with an interrogative pronoun and yes/no questions) have the same structure as affirmative sentences, but with intonation rising at the end. [10431250] |In the formal register, the question particle {{transl|ja|''-ka''}} is added. [10431260] |For example, {{transl|ja|''Ii desu''}} ({{lang|ja|いいです。}}) "It is OK" becomes {{transl|ja|''Ii desu-ka''}} ({{lang|ja|いいですか?}}) "Is it OK?". [10431270] |In a more informal tone sometimes the particle {{transl|ja|''-no''}} ({{lang|ja|の}}) is added instead to show a personal interest of the speaker: {{transl|ja|''Dōshite konai-no?''}} [10431280] |"Why aren't (you) coming?". [10431290] |Some simple queries are formed simply by mentioning the topic with an interrogative intonation to call for the hearer's attention: {{transl|ja|''Kore-wa?''}} [10431300] |"(What about) this?"; {{transl|ja|''Namae-wa?''}} ({{lang|ja|名前は?}}) "(What's your) name?". [10431310] |Negatives are formed by inflecting the verb. [10431320] |For example, {{transl|ja|''Pan-o taberu''}} ({{lang|ja|パンを食べる。}}) "I will eat bread" or "I eat bread" becomes {{transl|ja|''Pan-o tabenai''}} ({{lang|ja|パンを食べない。}}) "I will not eat bread" or "I do not eat bread". [10431330] |The so-called {{transl|ja|''-te''}} verb form is used for a variety of purposes: either progressive or perfect aspect (see above); combining verbs in a temporal sequence ({{transl|ja|''Asagohan-o tabete sugu dekakeru''}} "I'll eat breakfast and leave at once"), simple commands, conditional statements and permissions ({{transl|ja|''Dekakete-mo ii?''}} "May I go out?"), etc. [10431340] |The word {{transl|ja|''da''}} (plain), {{transl|ja|''desu''}} (polite) is the [[copula]] verb. 
[10431350] |It corresponds approximately to the English ''be'', but often takes on other roles, including a marker for tense, when the verb is conjugated into its past form {{transl|ja|''datta''}} (plain), {{transl|ja|''deshita''}} (polite). [10431360] |This comes into use because only {{transl|ja|''keiyōshi''}} adjectives and verbs can carry tense in Japanese. [10431370] |Two additional common verbs are used to indicate existence ("there is") or, in some contexts, possession: {{transl|ja|''aru''}} (negative {{transl|ja|''nai''}}) and {{transl|ja|''iru''}} (negative {{transl|ja|''inai''}}), for inanimate and animate things, respectively. [10431380] |For example, {{transl|ja|''Neko ga iru''}} "There's a cat", {{transl|ja|''Ii kangae-ga nai''}} "[I] haven't got a good idea". [10431390] |Note that the negative forms of the verbs {{transl|ja|''iru''}} and {{transl|ja|''aru''}} are actually ''i''-adjectives and inflect as such, e.g. {{transl|ja|''Neko ga inakatta''}} "There was no cat". [10431400] |The verb "to do" ({{transl|ja|''suru''}}, polite form {{transl|ja|''shimasu''}}) is often used to make verbs from nouns ({{transl|ja|''ryōri suru''}} "to cook", {{transl|ja|''benkyō suru''}} "to study", etc.) and has been productive in creating modern slang words. [10431410] |Japanese also has a huge number of compound verbs to express concepts that are described in English using a verb and a preposition (e.g. {{transl|ja|''tobidasu''}} "to fly out, to flee," from {{transl|ja|''tobu''}} "to fly, to jump" + {{transl|ja|''dasu''}} "to put out, to emit"). 
[10431420] |There are three types of [[Japanese adjectives|adjective]] (see also [[Japanese adjectives]]): [10431430] |# {{lang|ja|形容詞}} {{transl|ja|''keiyōshi''}}, or {{transl|ja|''i''}} adjectives, which have a [[Japanese verb conjugations|conjugating]] ending {{transl|ja|''i''}} ({{lang|ja|い}}) (such as {{lang|ja|あつい}} {{transl|ja|''atsui''}} "to be hot") which can become past ({{lang|ja|あつかった}} {{transl|ja|''atsukatta''}} "it was hot"), or negative ({{lang|ja|あつくない}} {{transl|ja|''atsuku nai''}} "it is not hot"). [10431440] |Note that {{transl|ja|''nai''}} is also an {{transl|ja|''i''}} adjective, which can become past ({{lang|ja|あつくなかった}} {{transl|ja|''atsuku nakatta''}} "it was not hot"). [10431450] |#: {{lang|ja|暑い日}} {{transl|ja|''atsui hi''}} "a hot day". [10431460] |# {{lang|ja|形容動詞}} {{transl|ja|''keiyōdōshi''}}, or {{transl|ja|''na''}} adjectives, which are followed by a form of the [[copula]], usually {{transl|ja|''na''}}. [10431470] |For example {{transl|ja|''hen''}} (strange) [10431480] |#: {{lang|ja|変なひと}} {{transl|ja|''hen na hito''}} "a strange person". [10431490] |# {{lang|ja|連体詞}} {{transl|ja|''rentaishi''}}, also called true adjectives, such as {{transl|ja|''ano''}} "that" [10431500] |#: {{lang|ja|あの山}} {{transl|ja|''ano yama''}} "that mountain". [10431510] |Both {{transl|ja|''keiyōshi''}} and {{transl|ja|''keiyōdōshi''}} may [[predicate (grammar)|predicate]] sentences. [10431520] |For example, [10431530] |: {{lang|ja|ご飯が熱い。}} {{transl|ja|''Gohan-ga atsui.''}} [10431540] |"The rice is hot." [10431550] |: {{lang|ja|彼は変だ。}} {{transl|ja|''Kare-wa hen da.''}} [10431560] |"He's strange." [10431570] |Both inflect, though they do not show the full range of conjugation found in true verbs. [10431580] |The {{transl|ja|''rentaishi''}} in Modern Japanese are few in number, and unlike the other words, are limited to directly modifying nouns. [10431590] |They never predicate sentences. 
[10431600] |Examples include {{transl|ja|''ookina''}} "big", {{transl|ja|''kono''}} "this", {{transl|ja|''iwayuru''}} "so-called" and {{transl|ja|''taishita''}} "amazing". [10431610] |Both {{transl|ja|''keiyōdōshi''}} and {{transl|ja|''keiyōshi''}} form [[adverb]]s, by following with {{transl|ja|''ni''}} in the case of {{transl|ja|''keiyōdōshi''}}: [10431620] |: {{lang|ja|変になる}} {{transl|ja|''hen ni naru''}} "become strange", [10431630] |and by changing {{transl|ja|''i''}} to {{transl|ja|''ku''}} in the case of {{transl|ja|''keiyōshi''}}: [10431640] |: {{lang|ja|熱くなる}} {{transl|ja|''atsuku naru''}} "become hot". [10431650] |The grammatical function of nouns is indicated by [[postposition]]s, also called [[Japanese particles|particles]]. [10431660] |These include for example: [10431670] |* '''{{lang|ja|が}} {{transl|ja|''ga''}}''' for the [[nominative case]]. [10431680] |Not necessarily a subject. [10431690] |: {{lang|ja|''彼'''が'''やった。''}}{{transl|ja|''Kare '''ga''' yatta.''}} [10431700] |"'''He''' did it." [10431710] |* '''{{lang|ja|に}} {{transl|ja|''ni''}}''' for the [[dative case]]. [10431720] |: {{lang|ja|田中さん'''に'''あげて下さい。}} {{transl|ja|''Tanaka-san '''ni''' agete kudasai''}} "Please give it to '''Mr. Tanaka'''." [10431730] |It is also used for the [[lative]] case, indicating a motion to a location. [10431740] |: {{lang|ja|''日本'' '''に'''行きたい。}} {{transl|ja|'''''Nihon''' '''ni''' ikitai''}} "I want to go ''to'' '''Japan'''." [10431750] |* '''{{lang|ja|の}} {{transl|ja|''no''}}''' for the [[genitive case]], or nominalizing phrases. [10431760] |: {{lang|ja|私'''の'''カメラ。}} {{transl|ja|''watashi '''no''' kamera''}} "'''my''' camera" [10431770] |: {{lang|ja|スキーに行く'''の'''が好きです。}} {{transl|ja|''Sukī-ni iku '''no''' ga suki desu''}} "(I) like go'''ing''' skiing." [10431780] |* '''{{lang|ja|を}} {{transl|ja|''o''}}''' for the [[accusative case]]. [10431790] |Not necessarily an object. 
[10431800] |: {{lang|ja|何'''を'''食べますか。}} {{transl|ja|''Nani '''o''' tabemasu ka?''}} [10431810] |"'''What''' will (you) eat?" [10431820] |* '''{{lang|ja|は}} {{transl|ja|''wa''}}''' for the topic. [10431830] |It can co-exist with case markers above except {{transl|ja|''no''}}, and it overrides {{transl|ja|''ga''}} and {{transl|ja|''o''}}. [10431840] |: {{lang|ja|私'''は'''タイ料理がいいです。}} {{transl|ja|''Watashi '''wa''' tai-ryōri ga ii desu.''}} [10431850] |"As for me, Thai food is good." [10431860] |The nominative marker {{transl|ja|''ga''}} after {{transl|ja|''watashi''}} is hidden under {{transl|ja|''wa''}}. [10431865] |(Note that English generally makes no distinction between sentence topic and subject.) [10431867] |Note: The difference between {{transl|ja|'''''wa'''''}} and {{transl|ja|'''''ga'''''}} goes beyond the English distinction between sentence topic and subject. [10431870] |While {{transl|ja|''wa''}} indicates the topic, which the rest of the sentence describes or acts upon, it carries the implication that the subject indicated by {{transl|ja|''wa''}} is not unique, or may be part of a larger group. [10431880] |: {{transl|ja|''Ikeda-san '''wa''' yonjū-ni sai da.''}} [10431890] |"As for Mr. Ikeda, he is forty-two years old." [10431900] |Others in the group may also be of that age. [10431910] |Absence of {{transl|ja|''wa''}} often means the subject is the [[focus (linguistics)|focus]] of the sentence. [10431920] |: {{transl|ja|''Ikeda-san '''ga''' yonjū-ni sai da.''}} [10431930] |"It is Mr. Ikeda who is forty-two years old." [10431940] |This is a reply to an implicit or explicit question about who in this group is forty-two years old. [10431950] |=== Politeness === [10431960] |Unlike most Western languages, Japanese has an extensive grammatical system to express politeness and formality. [10431970] |Most relationships are not equal in Japanese [[society]]. 
[10431980] |The differences in social position are determined by a variety of factors including job, age, experience, or even psychological state (e.g., a person asking a favour tends to do so politely). [10431990] |The person in the lower position is expected to use a polite form of speech, whereas the other might use a more plain form. [10432000] |Strangers will also speak to each other politely. [10432010] |Japanese children rarely use polite speech until they are teens, at which point they are expected to begin speaking in a more adult manner. [10432020] |''See [[uchi-soto]]''. [10432030] |Whereas {{transl|ja|''teineigo''}} ({{lang|ja|丁寧語}}) (polite language) is commonly an [[inflection]]al system, {{transl|ja|''sonkeigo''}} ({{lang|ja|尊敬語}}) (respectful language) and {{transl|ja|''kenjōgo''}} ({{lang|ja|謙譲語}}) (humble language) often employ many special honorific and humble alternate verbs: {{transl|ja|''iku''}} "go" becomes {{transl|ja|''ikimasu''}} in polite form, but is replaced by {{transl|ja|''irassharu''}} in honorific speech and {{transl|ja|''ukagau''}} or {{transl|ja|''mairu''}} in humble speech. [10432040] |The difference between honorific and humble speech is particularly pronounced in the Japanese language. [10432050] |Humble language is used to talk about oneself or one's own group (company, family) whilst honorific language is mostly used when describing the interlocutor and his/her group. [10432060] |The {{transl|ja|''-san''}} suffix ("Mr.", "Mrs." or "Miss") is an example of honorific language. [10432070] |It is not used to talk about oneself or when talking about someone from one's company to an external person, since the company is the speaker's "group". 
[10432080] |When speaking directly to one's superior in one's company or when speaking with other employees within one's company about a superior, a Japanese person will use vocabulary and inflections of the honorific register to refer to the in-group superior and his or her speech and actions. [10432090] |When speaking to a person from another company (i.e., a member of an out-group), however, a Japanese person will use the plain or the humble register to refer to the speech and actions of his or her own in-group superiors. [10432100] |In short, the register used in Japanese to refer to the person, speech, or actions of any particular individual varies depending on the relationship (either in-group or out-group) between the speaker and listener, as well as depending on the relative status of the speaker, listener, and third-person referents. [10432110] |For this reason, the Japanese system for explicit indication of social register is known as a system of "relative honorifics." [10432120] |This stands in stark contrast to the [[Korean language|Korean]] system of "absolute honorifics," in which the same register is used to refer to a particular individual (e.g. one's father, one's company president, etc.) in any context regardless of the relationship between the speaker and interlocutor. [10432130] |Thus, polite Korean speech can sound very presumptuous when translated verbatim into Japanese, as in Korean it is acceptable and normal to say things like "Our '''Mr.''' Company-President..." when communicating with a member of an out-group, which would be very inappropriate in a Japanese social context. [10432140] |Most [[noun]]s in the Japanese language may be made polite by the addition of {{transl|ja|''o-''}} or {{transl|ja|''go-''}} as a prefix. [10432145] |{{transl|ja|''o-''}} is generally used for words of native Japanese origin, whereas {{transl|ja|''go-''}} is affixed to words of Chinese derivation. 
[10432150] |In some cases, the prefix has become a fixed part of the word, and is included even in regular speech, such as {{transl|ja|''gohan''}} 'cooked rice; meal.' [10432160] |Such a construction often indicates deference to either the item's owner or to the object itself. [10432170] |For example, the word {{transl|ja|''tomodachi''}} 'friend,' would become {{transl|ja|''o-tomodachi''}} when referring to the friend of someone of higher status (though mothers often use this form to refer to their children's friends). [10432180] |On the other hand, a polite speaker may sometimes refer to {{transl|ja|''mizu''}} 'water' as {{transl|ja|''o-mizu''}} in order to show politeness. [10432190] |Most Japanese people employ politeness to indicate a lack of familiarity. [10432200] |That is, they use polite forms for new acquaintances, but if a relationship becomes more intimate, they no longer use them. [10432210] |This occurs regardless of age, social class, or gender. [10432220] |== Vocabulary == [10432230] |The original language of Japan, or at least the original language of a certain population that was ancestral to a significant portion of the historical and present Japanese nation, was the so-called {{transl|ja|''yamato kotoba''}} ({{lang|ja|大和言葉}} or infrequently {{lang|ja|大和詞}}, i.e. "[[Yamato people|Yamato]] words"), which in scholarly contexts is sometimes referred to as {{transl|ja|''wa-go''}} ({{lang|ja|和語}} or rarely {{lang|ja|倭語}}, i.e. the "{{transl|ja|[[Wa (Japan)|Wa]]}} words"). [10432240] |In addition to words from this original language, present-day Japanese includes a great number of words that were either borrowed from [[Chinese language|Chinese]] or constructed from Chinese roots following Chinese patterns. [10432250] |These words, known as {{transl|ja|''[[Sino-Japanese vocabulary|kango]]''}} ({{lang|ja|漢語}}), entered the language from the fifth century onwards via contact with Chinese culture. 
[10432260] |According to the [[Japanese dictionary]] ''Shinsen-kokugojiten'' (新選国語辞典), [[Sino-Japanese vocabulary|Chinese-based words]] make up 49.1% of the total vocabulary, ''wago'' 33.8%, and other foreign words 8.8%. [10432270] |Like Latin-derived words in English, {{transl|ja|''[[Sino-Japanese vocabulary|kango]]''}} words typically are perceived as somewhat formal or academic compared to equivalent Yamato words. [10432280] |Indeed, it is generally fair to say that an English word derived from Latin/French roots typically corresponds to a Sino-Japanese word in Japanese, whereas a simpler Anglo-Saxon word would best be translated by a Yamato equivalent. [10432290] |A much smaller number of words has been borrowed from [[Korean language|Korean]] and [[Ainu language|Ainu]]. [10432300] |Japanese has also borrowed a number of words from other languages, particularly ones of European extraction, which are called {{transl|ja|''[[gairaigo]]''}}. [10432310] |This began with [[Japanese words of Portuguese origin|borrowings from Portuguese]] in the 16th century, followed by borrowing from [[Dutch language|Dutch]] during Japan's [[sakoku|long isolation]] in the [[Edo period]]. [10432320] |With the [[Meiji Restoration]] and the reopening of Japan in the 19th century, borrowing occurred from [[German language|German]], [[French language|French]] and [[English language|English]]. [10432330] |Currently, words of English origin are the most commonly borrowed. [10432340] |In the Meiji era, the Japanese also coined many neologisms using Chinese roots and morphology to translate Western concepts. [10432350] |Many of these pseudo-Chinese words were in turn imported into [[Chinese language|Chinese]], [[Korean language|Korean]], and [[Vietnamese language|Vietnamese]] via their [[kanji]] in the late 19th and early 20th centuries. 
[10432360] |For example, {{lang|ja|政治}} {{transl|ja|''seiji''}} ("politics"), and {{lang|ja|化学}} {{transl|ja|''kagaku''}} ("chemistry") are words derived from Chinese roots that were first created and used by the Japanese, and only later borrowed into Chinese and other East Asian languages. [10432370] |As a result, Japanese, Chinese, Korean, and Vietnamese share a large common corpus of vocabulary in the same way a large number of Greek- and Latin-derived words are shared among modern European languages, although many academic words formed from such roots were certainly coined by native speakers of other languages, such as English. [10432380] |In the past few decades, {{transl|ja|''[[wasei-eigo]]''}} (made-in-Japan English) has become a prominent phenomenon. [10432390] |Words such as {{transl|ja|''wanpatān''}} {{lang|ja|ワンパターン}} (< ''one'' + ''pattern'', "to be in a rut", "to have a one-track mind") and {{transl|ja|''sukinshippu''}} {{lang|ja|スキンシップ}} (< ''skin'' + ''-ship'', "physical contact"), although coined by compounding English roots, are nonsensical in most non-Japanese contexts; exceptions exist in nearby languages such as Korean however, which often use words such as skinship and rimokon (remote control) in the same way as in Japanese. [10432400] |Additionally, many native Japanese words have become commonplace in English, due to the popularity of many Japanese cultural exports. [10432410] |Words such as [[futon]], [[haiku]], [[judo]], [[kamikaze]], [[karaoke]], [[karate]], [[ninja]], [[origami]], [[rickshaw]] (from {{lang|ja|人力車}} {{transl|ja|''jinrikisha''}}), [[samurai]], [[sayonara]], [[sumo]], [[sushi]], [[tsunami]], [[tycoon]] and many others have become part of the English language. [10432420] |See [[list of English words of Japanese origin]] for more. [10432430] |== Writing system == [10432440] |Literacy was introduced to Japan in the form of the [[Chinese writing system]], by way of [[Baekje]] before the 5th century. 
[10432450] |Using this language, the Japanese emperor [[Emperor Yūryaku|Yūryaku]] sent a letter to a Chinese emperor [[Emperor Shun of Liu Song|Liu Song]] in 478 CE. [10432460] |After the ruin of Baekje, Japan invited scholars from China to learn more of the Chinese writing system. [10432470] |Japanese Emperors gave an official rank to Chinese scholars (続守言/薩弘格/袁晋卿) and spread the use of Chinese characters from the 7th century to the 8th century. [10432480] |At first, the Japanese wrote in [[Classical Chinese]], with Japanese names represented by characters used for their meanings and not their sounds. [10432490] |Later, during the seventh century CE, the Chinese-sounding phoneme principle was used to write pure Japanese poetry and prose (comparable to Akkadian's retention of Sumerian cuneiform), but some Japanese words were still written with characters for their meaning and not the original Chinese sound. [10432500] |This is when the history of Japanese as a written language begins in its own right. [10432510] |By this time, the Japanese language was already distinct from the [[Ryukyuan languages]]. [10432520] |The Korean settlers and their descendants used Kudara-on or Baekje pronunciation (百済音), which was also called Tsushima-pronunciation (対馬音) or [[Go-on]] (呉音). [10432530] |An example of this mixed style is the [[Kojiki]], which was written in 712 AD. [10432540] |They then started to use Chinese characters to write Japanese in a style known as {{transl|ja|''man'yōgana''}}, a syllabic script which used Chinese characters for their sounds in order to transcribe the words of Japanese speech syllable by syllable. [10432550] |Over time, a writing system evolved. [10432560] |[[Chinese characters]] ([[kanji]]) were used to write either words borrowed from Chinese, or Japanese words with the same or similar meanings. 
[10432570] |Chinese characters were also used to write grammatical elements, were simplified, and eventually became two syllabic scripts: [[hiragana]] and [[katakana]]. [10432580] |Modern Japanese is written in a mixture of three main systems: [[kanji]], characters of Chinese origin used to represent both Chinese [[loanword]]s into Japanese and a number of native Japanese [[morpheme]]s; and two [[syllabary|syllabaries]]: [[hiragana]] and [[katakana]]. [10432590] |The [[Latin alphabet]] is also sometimes used. [10432600] |Arabic numerals are much more common than the kanji when used in counting, but kanji numerals are still used in compounds, such as {{lang|ja|統一}} {{transl|ja|''tōitsu''}} ("unification"). [10432610] |''[[Hiragana]]'' are used for words without kanji representation, for words no longer written in kanji, and also following kanji to show conjugational endings. [10432620] |Because of the way verbs (and adjectives) in Japanese are [[conjugated]], kanji alone cannot fully convey Japanese tense and mood, as kanji cannot be subject to variation when written without losing its meaning. [10432630] |For this reason, hiragana are suffixed to the ends of kanji to show verb and adjective conjugations. [10432640] |Hiragana used in this way are called [[okurigana]]. [10432650] |Hiragana are also written in a superscript called [[furigana]] above or beside a kanji to show the proper reading. [10432660] |This is done to facilitate learning, as well as to clarify particularly old or obscure (or sometimes invented) readings. [10432670] |''[[Katakana]]'', like hiragana, are a syllabary; katakana are primarily used to write foreign words, plant and animal names, and for emphasis. [10432680] |For example "Australia" has been adapted as {{transl|ja|''Ōsutoraria''}} ({{lang|ja|オーストラリア}}), and "supermarket" has been adapted and shortened into {{transl|ja|''sūpā''}} ({{lang|ja|スーパー}}). 
[10432690] |The [[Latin alphabet]] (in Japanese referred to as [[romaji|''Rōmaji'']] ({{lang|ja|ローマ字}}), literally "Roman letters") is used for some loan words like "CD" and "DVD", and also for some Japanese creations like "Sony". [10432700] |Historically, attempts to limit the number of kanji in use commenced in the mid-19th century, but did not become a matter of government intervention until after Japan's defeat in the Second World War. [10432710] |During the period of post-war occupation (and influenced by the views of some U.S. officials), various schemes including the complete abolition of kanji and exclusive use of rōmaji were considered. [10432720] |The {{transl|ja|''[[jōyō kanji]]''}} ("common use kanji", originally called {{transl|ja|''[[tōyō kanji]]''}} [kanji for general use]) scheme arose as a compromise solution. [10432730] |Japanese students begin to learn kanji from their first year at elementary school. [10432740] |A guideline created by the Japanese Ministry of Education, the list of {{transl|ja|''[[kyōiku kanji]]''}} ("education kanji", a subset of {{transl|ja|''[[jōyō kanji]]''}}), specifies the 1,006 simple characters a child is to learn by the end of sixth grade. [10432750] |Children continue to study another 939 characters in junior high school, covering in total 1,945 {{transl|ja|''[[jōyō kanji]]''}}. [10432760] |The official list of {{transl|ja|''[[jōyō kanji]]''}} was revised several times, but the total number of officially sanctioned characters remained largely unchanged. [10432770] |As for kanji for personal names, the circumstances are somewhat complicated. [10432780] |{{transl|ja|''[[Jōyō kanji]]''}} and {{transl|ja|''[[jinmeiyō kanji]]''}} (an appendix of additional characters for names) are approved for registering personal names. [10432790] |Names containing unapproved characters are denied registration. 
[10432800] |However, as with the list of {{transl|ja|''[[jōyō kanji]]''}}, criteria for inclusion were often arbitrary and led to many common and popular characters being disapproved for use. [10432810] |Under popular pressure and following a court decision holding the exclusion of common characters unlawful, the list of {{transl|ja|''[[jinmeiyō kanji]]''}} was substantially extended from 92 in 1951 (the year it was first decreed) to 983 in 2004. [10432820] |Furthermore, families whose names are not on these lists were permitted to continue using the older forms. [10432830] |Many writers rely on [[newspaper]] circulation to publish their work with officially sanctioned characters. [10432840] |This distribution method is more efficient than traditional [[pen]] and [[paper]] publications. [10432850] |==Study by non-native speakers== [10432860] |Many major universities throughout the world provide Japanese language courses, and a number of secondary and even primary schools worldwide offer courses in the language. [10432870] |International interest in the Japanese language dates from the 1800s but has become more prevalent following Japan's economic bubble of the 1980s and the global popularity of [[Japanese pop culture]] (such as [[anime]] and [[video games]]) since the 1990s. [10432880] |About 2.3 million people studied the language worldwide in 2003: 900,000 South [[Koreans]], 389,000 [[People's Republic of China|Chinese]], 381,000 [[Australians]], and 140,000 [[United States|Americans]] study Japanese in lower and higher educational institutions. [10432890] |In Japan, more than 90,000 foreign students study at [[List of universities in Japan|Japanese universities]] and Japanese [[language school]]s, including 77,000 Chinese and 15,000 South Koreans in 2003. 
[10432900] |In addition, local governments and some [[non-profit organisation|NPO]] groups provide free Japanese language classes for foreign residents, including [[Japanese Brazilians]] and foreigners married to Japanese nationals. [10432910] |In the United Kingdom, studies are supported by the [[British Association for Japanese Studies]]. [10432920] |In Ireland, Japanese is offered as a language in the [[Leaving Certificate]] in some schools. [10432930] |The Japanese government provides standardised tests to measure spoken and written comprehension of Japanese for second language learners; the most prominent is the [[Japanese Language Proficiency Test]] (JLPT). [10432940] |The Japanese External Trade Organisation ([[JETRO]]) organises the ''Business Japanese Proficiency Test'', which tests the learner's ability to understand Japanese in a business setting. [10432950] |When learning Japanese in a college setting, students are usually first taught how to pronounce [[romaji]]. [10432960] |From that point, they are taught the two main syllabaries, with [[kanji]] usually being introduced in the second semester. [10432970] |The focus is usually first on polite (distal) speech, which students who might interact with native speakers would be expected to use. [10432980] |Casual speech and formal speech usually follow polite speech, as well as the usage of honorifics. [10440010] |
Java (programming language)
[10440020] |'''Java''' is a [[programming language]] originally developed by [[Sun Microsystems]] and released in 1995 as a core component of Sun Microsystems' [[Java (Sun)|Java platform]]. [10440030] |The language derives much of its [[Syntax of programming languages|syntax]] from [[C (programming language)|C]] and [[C++]] but has a simpler [[object model]] and fewer low-level facilities. [10440040] |Java applications are typically [[compiler|compiled]] to [[bytecode]] that can run on any [[Java virtual machine]] (JVM) regardless of [[computer architecture]]. [10440050] |The original and [[reference implementation]] Java [[compiler]]s, virtual machines, and [[library (computing)|class libraries]] were developed by Sun from 1995. [10440060] |As of May 2007, in compliance with the specifications of the [[Java Community Process]], Sun made available most of their Java technologies as [[free software]] under the [[GNU General Public License]]. [10440070] |Others have also developed alternative implementations of these Sun technologies, such as the [[GNU Compiler for Java]] and [[GNU Classpath]]. [10440080] |== History == [10440090] |The Java language was created by [[James Gosling]] in June 1991 for use in one of his many [[set-top box]] projects. [10440100] |The language was initially called ''Oak'', after an [[oak tree]] that stood outside Gosling's office—and also went by the name ''Green''—and ended up later being renamed to ''Java'', from a list of random words. [10440110] |Gosling's goals were to implement a [[virtual machine]] and a language that had a familiar C/C++ style of notation. [10440120] |The first public implementation was Java 1.0 in 1995. [10440130] |It promised "[[Write once, run anywhere|Write Once, Run Anywhere]]" (WORA), providing no-cost runtimes on popular platforms. [10440140] |It was fairly secure and its security was configurable, allowing network and file access to be restricted. 
[10440150] |Major web browsers soon incorporated the ability to run secure Java ''[[applet]]s'' within web pages. [10440160] |Java quickly became popular. [10440170] |With the advent of ''Java 2'', new versions had multiple configurations built for different types of platforms. [10440180] |For example, ''[[J2EE]]'' was for enterprise applications and the greatly stripped down version ''[[J2ME]]'' was for mobile applications. [10440190] |''[[J2SE]]'' was the designation for the Standard Edition. [10440200] |In 2006, for marketing purposes, new ''J2'' versions were renamed ''Java EE'', ''Java ME'', and ''Java SE'', respectively. [10440210] |In 1997, Sun Microsystems approached the [[International Organization for Standardization#JTC1|ISO/IEC JTC1 standards body]] and later the [[Ecma International]] to formalize Java, but it soon withdrew from the process. [10440220] |Java remains a [[de facto]] standard that is controlled through the [[Java Community Process]]. [10440230] |At one time, Sun made most of its Java implementations available without charge although they were [[proprietary software]]. [10440240] |Sun's revenue from Java was generated by the selling of licenses for specialized products such as the Java Enterprise System. [10440250] |Sun distinguishes between its [[Software Development Kit|Software Development Kit (SDK)]] and [[HotSpot|Runtime Environment (JRE)]] that is a subset of the SDK, the primary distinction being that in the JRE, the compiler, utility programs, and many necessary header files are not present. [10440260] |On [[13 November]] [[2006]], Sun released much of Java as [[free software|free]] and [[open-source software|open-source]] software under the terms of the [[GNU General Public License]] (GPL). [10440270] |On [[8 May]] [[2007]] Sun finished the process, making all of Java's core code free and open-source, aside from a small portion of code to which Sun did not hold the copyright. 
[10440280] |== Philosophy == [10440290] |=== Primary goals === [10440300] |There were five primary goals in the creation of the Java language: [10440310] |# It should use the [[object-oriented programming]] methodology. [10440320] |# It should allow the same program to be [[execution (computers)|executed]] on multiple [[operating system]]s. [10440330] |# It should contain built-in support for using [[computer network]]s. [10440340] |# It should be designed to execute code from [[remote procedure call|remote source]]s securely. [10440350] |# It should be easy to use by selecting what were considered the good parts of other object-oriented languages. [10440360] |=== Platform independence === [10440370] |One characteristic, [[Cross-platform|platform independence]], means that [[computer program|program]]s written in the Java language must run similarly on any supported hardware/operating-system platform. [10440380] |One should be able to write a program once, compile it once, and run it anywhere. [10440390] |This is achieved by most Java [[compiler]]s by compiling the Java language code ''halfway'' (to [[Java bytecode]]) – simplified machine instructions specific to the Java platform. [10440400] |The code is then run on a [[virtual machine]] (VM), a program written in native code on the host hardware that [[Interpreter (computing)|interprets]] and executes generic Java bytecode. [10440410] |(In some JVM versions, bytecode can also be compiled to native code, either before or during program execution, resulting in faster execution.) [10440420] |Further, standardized libraries are provided to allow access to features of the host machines (such as graphics, [[thread (computer science)|threading]] and [[Computer network|networking]]) in unified ways. [10440430] |Note that, although there is an explicit compiling stage, at some point, the Java bytecode is interpreted or converted to native [[machine code]] by the [[Just-in-time compilation|JIT compiler]]. 
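The platform-independence idea described above can be illustrated with a small sketch. The class name PlatformInfo and the helper describe() are hypothetical names chosen for this illustration; the property keys os.name, os.arch and java.vm.name are standard Java system properties. The assumption is that the same compiled class file, copied unchanged to another supported platform, would simply report that platform's values.

```java
// PlatformInfo.java
// Illustrative sketch only: PlatformInfo and describe() are hypothetical names.
class PlatformInfo {

    /** Builds a one-line description of the host this JVM is running on. */
    static String describe() {
        // These keys are standard system properties of the Java platform.
        String os = System.getProperty("os.name");
        String arch = System.getProperty("os.arch");
        String vm = System.getProperty("java.vm.name");
        return "Running on " + os + " (" + arch + ") using " + vm;
    }

    public static void main(String[] args) {
        // The bytecode is identical everywhere; only the reported values differ.
        System.out.println(describe());
    }
}
```

Compiling this once and launching the resulting class on two different operating systems should print two different lines without recompilation, which is the behaviour the bytecode and virtual machine design aims for.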
[10440440] |The first implementations of the language used an interpreted virtual machine to achieve [[Porting|portability]]. [10440450] |These implementations produced programs that ran slower than programs compiled to native executables, for instance written in C or C++, so the language suffered a reputation for poor performance. [10440460] |More recent JVM implementations produce programs that run significantly faster than before, using multiple techniques. [10440470] |One technique, known as ''just-in-time compilation'' (JIT), translates the Java bytecode into native code at the time that the program is run, which results in a program that executes faster than interpreted code but also incurs compilation overhead during execution. [10440480] |More sophisticated VMs use ''[[dynamic recompilation]]'', in which the VM can analyze the behavior of the running program and selectively recompile and optimize critical parts of the program. [10440490] |Dynamic recompilation can achieve optimizations superior to static compilation because the dynamic compiler can base optimizations on knowledge about the runtime environment and the set of loaded classes, and can identify the ''hot spots'' (parts of the program, often inner loops, that take up the most execution time). [10440500] |JIT compilation and dynamic recompilation allow Java programs to take advantage of the speed of native code without losing portability. [10440510] |Another technique, commonly known as ''static compilation'', is to compile directly into native code like a more traditional compiler. [10440520] |Static Java compilers, such as [[GCJ]], translate the Java language code to native [[object code]], removing the intermediate bytecode stage. [10440530] |This achieves good performance compared to interpretation, but at the expense of portability; the output of these compilers can only be run on a single [[Computer architecture|architecture]]. 
[10440540] |Some see avoiding the VM in this manner as defeating the point of developing in Java; however, it can be useful to provide both a generic [[bytecode]] version and an optimised native code version of an application. [10440550] |=== Implementations === [10440560] |Sun Microsystems officially licenses the Java Standard Edition platform for [[Microsoft Windows]], [[Linux]], and [[Solaris (operating system)|Solaris]]. [10440570] |Through a network of third-party vendors and licensees, alternative Java environments are available for these and other platforms. [10440580] |To qualify as a certified Java licensee, an implementation on any particular platform must pass a rigorous suite of validation and compatibility tests. [10440590] |This method ensures a guaranteed level of compliance across platforms through a trusted set of commercial and non-commercial partners. [10440600] |Sun's trademark license for usage of the Java brand insists that all implementations be "compatible". [10440610] |This resulted in a legal dispute with [[Microsoft]] after Sun claimed that the Microsoft implementation did not support the [[Java remote method invocation|RMI]] and [[Java Native Interface|JNI]] interfaces and had added platform-specific features of its own. [10440620] |Sun sued in 1997, and in 2001 won a settlement of $20 million as well as a court order enforcing the terms of the license from Sun. [10440630] |As a result, Microsoft no longer ships Java with [[Microsoft Windows|Windows]], and in recent versions of Windows, [[Internet Explorer]] cannot support Java applets without a third-party plugin. [10440640] |However, Sun and others have made available Java run-time systems at no cost for those and other versions of Windows. [10440650] |Platform-independent Java is essential to the [[Java Enterprise Edition]] strategy, and an even more rigorous validation is required to certify an implementation. 
[10440660] |This environment enables portable server-side applications, such as [[Web service]]s, [[servlet]]s, and [[Enterprise JavaBean]]s, as well as [[Embedded system]]s based on [[OSGi]], using [[Embedded Java]] environments. [10440670] |Through the new [[GlassFish]] project, Sun is working to create a fully functional, unified [[open-source]] implementation of the Java EE technologies. [10440680] |=== Automatic memory management === [10440690] |One of the ideas behind Java's automatic memory management model is that programmers be spared the burden of having to perform manual memory management. [10440700] |In some languages the programmer allocates memory for the creation of objects stored on the [[heap]], and the responsibility for later deallocating that memory also resides with the programmer. [10440710] |If the programmer forgets to deallocate memory or writes code that fails to do so, a [[memory leak]] occurs and the program can consume an arbitrarily large amount of memory. [10440720] |Additionally, if the program attempts to deallocate the region of memory more than once, the result is undefined and the program may become unstable and may crash. [10440730] |Finally, in non-garbage-collected environments, user code carries a certain degree of overhead and complexity to track and finalize allocations. [10440740] |Often developers may box themselves into certain designs to provide reasonable assurances that memory leaks will not occur. [10440750] |In Java, this potential problem is avoided by [[automatic garbage collection]]. [10440760] |The programmer determines when objects are created, and the Java runtime is responsible for managing the [[object lifetime|object's lifecycle]]. [10440770] |The program or other objects can reference an object by holding a reference to it (which, from a low-level point of view, is its address on the heap). 
[10440780] |When no references to an object remain, the [[unreachable object]] is eligible for release by the Java garbage collector - it may be freed automatically by the garbage collector at any time. [10440790] |Memory leaks may still occur if a programmer's code holds a reference to an object that is no longer needed—in other words, they can still occur but at higher conceptual levels. [10440800] |The use of garbage collection in a language can also affect programming paradigms. [10440810] |If, for example, the developer assumes that the cost of memory allocation/recollection is low, they may choose to more freely construct objects instead of pre-initializing, holding and reusing them. [10440820] |With the small cost of potential performance penalties (inner-loop construction of large/complex objects), this facilitates thread-isolation (no need to synchronize as different threads work on different object instances) and data-hiding. [10440830] |The use of transient immutable value-objects minimizes side-effect programming. [10440840] |Comparing Java and [[C++]], it is possible in C++ to implement similar functionality (for example, a memory management model for specific classes can be designed in C++ to improve speed and lower memory fragmentation considerably), with the possible cost of adding comparable runtime overhead to that of Java's garbage collector, and of added development time and application complexity if one favors manual implementation over using an existing third-party library. [10440850] |In Java, garbage collection is built-in and virtually invisible to the developer. [10440860] |That is, developers may have no notion of when garbage collection will take place as it may not necessarily correlate with any actions being explicitly performed by the code they write. 
[10440870] |Depending on intended application, this can be beneficial or disadvantageous: the programmer is freed from performing low-level tasks, but at the same time loses the option of writing lower level code. [10440880] |Additionally, the garbage collection capability demands some attention to tuning the JVM, as large heaps will cause apparently random stalls in performance. [10440890] |Java does not support [[pointer (computing)|pointer arithmetic]] as is supported in, for example, C++. [10440900] |This is because the garbage collector may relocate referenced objects, invalidating such pointers. [10440910] |Another reason that Java forbids this is that type safety and security can no longer be guaranteed if arbitrary manipulation of pointers is allowed. [10440920] |== Syntax == [10440930] |The syntax of Java is largely derived from [[C++]]. [10440940] |Unlike C++, which combines the syntax for structured, generic, and object-oriented programming, Java was built exclusively as an object-oriented language. [10440950] |As a result, almost everything is an object and all code is written inside a class. [10440960] |The exceptions are the intrinsic data types (ordinal and real numbers, boolean values, and characters), which are not classes for performance reasons. [10440970] |=== Hello, world program === [10440980] |This is a minimal [[Hello world program]] in Java with [[syntax highlighting]]: [10440990] |
 // Hello.java
 public class Hello {
     public static void main(String[] args) {
         System.out.println("Hello, world!");
     }
 }
[10441000] |To execute a Java program, the code is saved as a file named Hello.java. [10441010] |It must first be compiled into bytecode using a [[Java compiler]], which produces a file named Hello.class. [10441020] |This class is then ''launched''. [10441030] |The above example merits a bit of explanation. [10441040] |* All executable statements in Java are written inside a class, including stand-alone programs. 
[10441050] |* Source files are by convention named the same as the class they contain, appending the mandatory suffix ''.java''. [10441060] |A '''class''' that is declared '''public''' is required to follow this convention. [10441070] |(In this case, the class '''Hello''' is public, therefore the source must be stored in a file called ''Hello.java''). [10441080] |* The compiler will generate a class file for each class defined in the source file. [10441090] |The name of the class file is the name of the class, with ''.class'' appended. [10441100] |For class file generation, anonymous classes are treated as if their name was the concatenation of the name of their enclosing class, a ''$'', and an integer. [10441110] |* The [[Java keywords|keyword]] '''public''' denotes that a method can be called from code in other classes, or that a class may be used by classes outside the class hierarchy. [10441120] |* The keyword '''static''' indicates that the method is a [[class method|static method]], associated with the class rather than object instances. [10441130] |* The keyword '''void''' indicates that the main method does not return any value to the caller. [10441140] |* The method name "main" is not a keyword in the Java language. [10441150] |It is simply the name of the method the Java launcher calls to pass control to the program. [10441160] |Java classes that run in managed environments such as applets and [[Enterprise Java Beans]] do not use or need a main() method. [10441170] |* The main method must accept an [[array]] of '''{{Javadoc:SE|java/lang|String}}''' objects. [10441180] |By convention, it is referenced as '''args''' although any other legal identifier name can be used. [10441190] |Since Java 5, the main method can also use [[varargs|variable arguments]], in the form of public static void main(String... args), allowing the main method to be invoked with an arbitrary number of String arguments. 
[10441200] |The effect of this alternate declaration is semantically identical (the args parameter is still an array of String objects), but allows an alternate syntax for creating and passing the array. [10441210] |* The Java launcher launches Java by loading a given class (specified on the command line) and starting its public static void main(String[]) method. [10441220] |Stand-alone programs must declare this method explicitly. [10441230] |The String[] args parameter is an [[array]] of {{Javadoc:SE|java/lang|String}} objects containing any arguments passed to the class. [10441240] |The parameters to main are often passed by means of a [[command line]]. [10441250] |* The printing facility is part of the Java standard library: The '''{{Javadoc:SE|java/lang|System}}''' class defines a public static field called '''{{Javadoc:SE|name=out|java/lang|System|out}}'''. [10441260] |The out object is an instance of the {{Javadoc:SE|java/io|PrintStream}} class and provides the method '''{{Javadoc:SE|name=println(String)|java/io|PrintStream|println(java.lang.String)}}''' for displaying data to the screen while creating a new line ([[standard streams|standard out]]). [10441270] |=== A more comprehensive example === [10441280] |
 // OddEven.java
 import javax.swing.JOptionPane;
 public class OddEven {
     public static void main(String[] args) {
         // This is the main method. It gets called when this class is run through a Java interpreter.
         OddEven number = new OddEven();
         /* This line of code creates a new instance of this class called "number" and
          * initializes it, and the next line of code calls the "showDialog()" method,
          * which brings up a prompt to ask you for a number */
         number.showDialog();
     }
     private int input; // A whole number ("int" means integer)
     // "input" is the number that the user gives to the computer
     public OddEven() {
         /* This is the constructor method. It gets called when an object of the OddEven type
          * is created.
          */
     }
     public void showDialog() {
         try
         /* This makes sure nothing goes wrong. If something does,
          * the interpreter skips to "catch" to see what it should do.
          */
         {
             input = Integer.parseInt(JOptionPane.showInputDialog("Please Enter A Number"));
             calculate();
             /*
              * The code above brings up a JOptionPane, which is a dialog box
              * The String returned by the "showInputDialog()" method is converted into
              * an integer, making the program treat it as a number instead of a word.
              * After that, this method calls a second method, calculate() that will
              * display either "Even" or "Odd."
              */
         }
         catch (NumberFormatException e)
         /* This means that there was a problem with the format of the number
          * (Like if someone were to type in 'Hello world' instead of a number).
          */
         {
             System.err.println("ERROR: Invalid input. Please type in a numerical value.");
         }
     }
     private void calculate() {
         if (input % 2 == 0)
             System.out.println("Even");
         /* When this gets called, it sends a message to the interpreter.
          * The interpreter usually shows it on the command prompt (For Windows users)
          * or the terminal (For Linux users). (Assuming it's open)
          */
         else
             System.out.println("Odd");
     }
 }
[10441290] |* The '''[[Java keywords#import|import]]''' statement imports the '''{{Javadoc:SE|javax/swing|JOptionPane}}''' class from the '''{{Javadoc:SE|package=javax.swing|javax/swing}}''' package. [10441300] |* The '''OddEven''' class declares a single '''[[Java keywords#private|private]]''' [[field (computer science)|field]] of type '''int''' named '''input'''. [10441310] |Every instance of the OddEven class has its own copy of the input field. [10441320] |The private declaration means that no other class can access (read or write) the input field. [10441330] |* '''OddEven()''' is a '''public''' [[constructor (computer science)|constructor]]. [10441340] |Constructors have the same name as the enclosing class they are declared in, and unlike a method, have no [[return type]]. 
[10441350] |A constructor is used to initialize an [[object (computer science)|object]] that is a newly created instance of the class. [10441360] |* In '''showDialog()''', the input dialog returns a String that is converted to an int by the '''{{Javadoc:SE|java/lang|Integer|parseInt(String)}}''' method. [10441370] |* The '''calculate()''' method is declared without the static keyword. [10441380] |This means that the method is invoked using a specific instance of the OddEven class. [10441390] |(The [[reference (computer science)|reference]] used to invoke the method is passed as an undeclared parameter of type OddEven named '''[[Java keywords#this|this]]'''.) [10441400] |The method tests the expression input % 2 == 0 using the '''[[Java keywords#if|if]]''' keyword to see whether the remainder of dividing the instance's input field by two is zero. [10441410] |If this expression is true, it prints '''Even'''; if it is false, it prints '''Odd'''. [10441420] |(The input field can be equivalently accessed as this.input, which explicitly uses the undeclared this parameter.) [10441430] |* '''OddEven number = new OddEven();''' declares a local object [[reference (computer science)|reference]] variable in the main method named number. [10441440] |This variable can hold a reference to an object of type OddEven. [10441450] |The declaration initializes number by first creating an instance of the OddEven class, using the '''[[Java keywords#new|new]]''' keyword and the OddEven() constructor, and then assigning this instance to the variable. [10441460] |* The statement '''number.showDialog();''' calls the showDialog method, which in turn calls calculate(). [10441470] |The OddEven instance referenced by the number [[local variable]] is used to invoke the method and is passed as the undeclared this parameter to the showDialog method. [10441480] |* [[Error handling]] in this example is minimal. [10441490] |Without the try/catch block in showDialog(), entering a value that is not a number would cause the program to crash.
[10441500] |This can be avoided by catching and handling the {{Javadoc:SE|java/lang|NumberFormatException}} thrown by Integer.parseInt(String). [10441510] |=== Applet === [10441520] |Java applets are programs that are embedded in other applications, typically in a Web page displayed in a [[Web browser]]. [10441530] | // Hello.java import java.applet.Applet; import java.awt.Graphics;public class Hello extends Applet { public void paint(Graphics gc) { gc.drawString("Hello, world!", 65, 95); } } [10441540] |The '''import''' statements direct the [[Java compiler]] to include the '''{{Javadoc:SE|package=java.applet|java/applet|Applet}}''' and '''{{Javadoc:SE|package=java.awt|java/awt|Graphics}}''' classes in the compilation. [10441550] |The import statement allows these classes to be referenced in the [[source code]] using the ''simple class name'' (i.e. Applet) instead of the ''fully qualified class name'' (i.e. java.applet.Applet). [10441560] |The Hello class '''extends''' ([[subclass (computer science)|subclasses]]) the '''Applet''' class; the Applet class provides the framework for the host application to display and control the [[Object lifetime|lifecycle]] of the applet. [10441570] |The Applet class is an [[Abstract Windowing Toolkit]] (AWT) {{Javadoc:SE|java/awt|Component}}, which provides the applet with the capability to display a [[graphical user interface]] (GUI) and respond to user [[event-driven programming|events]]. [10441580] |The Hello class [[method overriding (programming)|overrides]] the '''{{Javadoc:SE|name=paint(Graphics)|java/awt|Container|paint(java.awt.Graphics)}}''' method inherited from the {{Javadoc:SE|java/awt|Container}} [[superclass (computer science)|superclass]] to provide the code to display the applet. [10441590] |The paint() method is passed a '''Graphics''' object that contains the graphic context used to display the applet. 
[10441600] |The paint() method calls the graphic context '''{{Javadoc:SE|name=drawString(String, int, int)|java/awt|Graphics|drawString(java.lang.String,%20int,%20int)}}''' method to display the '''"Hello, world!"''' string at a [[pixel]] offset of ('''65, 95''') from the upper-left corner in the applet's display. [10441610] | <html> <head> <title>Hello World Applet</title> </head> <body> <applet code="Hello" width="200" height="200"></applet> </body> </html> [10441620] |An applet is placed in an [[HTML]] document using the '''<applet>''' [[HTML element]]. [10441630] |The applet tag has three attributes set: '''code="Hello"''' specifies the name of the Applet class and '''width="200" height="200"''' sets the pixel width and height of the applet. [10441640] |Applets may also be embedded in HTML using either the object or embed element, although support for these elements by Web browsers is inconsistent. [10441650] |However, the applet tag is deprecated, so the object tag is preferred where supported. [10441660] |The host application, typically a Web browser, instantiates the '''Hello''' applet and creates an {{Javadoc:SE|java/applet|AppletContext}} for the applet. [10441670] |Once the applet has initialized itself, it is added to the AWT display hierarchy. [10441680] |The paint method is called by the AWT [[event dispatching thread]] whenever the display needs the applet to draw itself. [10441690] |=== Servlet === [10441700] |Java Servlet technology provides Web developers with a simple, consistent mechanism for extending the functionality of a Web server and for accessing existing business systems. [10441710] |Servlets are [[server-side]] Java EE components that generate responses (typically [[HTML]] pages) to requests (typically [[HTTP]] requests) from [[client (computing)|client]]s. [10441720] |A servlet can almost be thought of as an applet that runs on the server side, without a face.
[10441730] | // Hello.java import java.io.*; import javax.servlet.*;public class Hello extends GenericServlet { public void service(ServletRequest request, ServletResponse response) throws ServletException, IOException { response.setContentType("text/html"); final PrintWriter pw = response.getWriter(); pw.println("Hello, world!"); pw.close(); } } [10441740] |The '''import''' statements direct the Java compiler to include all of the public classes and [[interface (Java)|interfaces]] from the '''{{Javadoc:SE|package=java.io|java/io}}''' and '''{{Javadoc:EE|package=javax.servlet|javax/servlet}}''' [[Java package|packages]] in the compilation. [10441750] |The '''Hello''' class '''extends''' the '''{{Javadoc:EE|javax/servlet|GenericServlet}}''' class; the GenericServlet class provides the interface for the [[server (computing)|server]] to forward requests to the servlet and control the servlet's lifecycle. [10441760] |The Hello class overrides the '''{{Javadoc:EE|name=service(ServletRequest, ServletResponse)|javax/servlet|Servlet|service(javax.servlet.ServletRequest,javax.servlet.ServletResponse)}}''' method defined by the {{Javadoc:EE|javax/servlet|Servlet}} [[Interface (Java)|interface]] to provide the code for the service request handler. [10441770] |The service() method is passed a '''{{Javadoc:EE|javax/servlet|ServletRequest}}''' object that contains the request from the client and a '''{{Javadoc:EE|javax/servlet|ServletResponse}}''' object used to create the response returned to the client. [10441780] |The service() method declares that it '''throws''' the [[exception handling|exceptions]] {{Javadoc:EE|javax/servlet|ServletException}} and {{Javadoc:SE|java/io|IOException}} if a problem prevents it from responding to the request. 
[10441790] |The '''{{Javadoc:EE|name=setContentType(String)|javax/servlet|ServletResponse|setContentType(java.lang.String)}}''' method in the response object is called to set the [[MIME]] content type of the returned data to '''"text/html"'''. [10441800] |The '''{{Javadoc:EE|name=getWriter()|javax/servlet|ServletResponse|getWriter()}}''' method in the response returns a '''{{Javadoc:SE|java/io|PrintWriter}}''' object that is used to write the data that is sent to the client. [10441810] |The '''{{Javadoc:SE|name=println(String)|java/io|PrintWriter|println(java.lang.String)}}''' method is called to write the '''"Hello, world!"''' string to the response and then the '''{{Javadoc:SE|name=close()|java/io|PrintWriter|close()}}''' method is called to close the print writer, which causes the data that has been written to the stream to be returned to the client. [10441820] |=== JavaServer Page === [10441830] |JavaServer Pages (JSPs) are [[server-side]] Java EE components that generate responses, typically [[HTML]] pages, to [[HTTP]] requests from [[client (computing)|client]]s. [10441840] |JSPs embed Java code in an HTML page by using the special [[delimiter]]s <% and %>. [10441850] |A JSP is compiled to a Java ''servlet'', a Java application in its own right, the first time it is accessed. [10441860] |After that, the generated servlet creates the response. [10441870] |=== Swing application === [10441880] |Swing is a graphical user interface [[library (computer science)|library]] for the Java SE platform. [10441890] |This example Swing application creates a single window with "Hello, world!" 
inside: [10441900] | // Hello.java (Java SE 5) import java.awt.BorderLayout; import javax.swing.*;public class Hello extends JFrame { public Hello() { super("hello"); setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE); setLayout(new BorderLayout()); add(new JLabel("Hello, world!")); pack(); }public static void main(String[] args) { new Hello().setVisible(true); } } [10441910] |The first '''import''' statement directs the Java compiler to include the {{Javadoc:SE|java/awt|BorderLayout}} class from the {{Javadoc:SE|package=java.awt|java/awt}} package in the compilation; the second '''import''' includes all of the public classes and interfaces from the '''{{Javadoc:SE|package=javax.swing|javax/swing}}''' package. [10441920] |The '''Hello''' class '''extends''' the '''{{Javadoc:SE|javax/swing|JFrame}}''' class; the JFrame class implements a [[window (computing)|window]] with a [[title bar]] and a close [[Widget (computing)|control]]. [10441930] |The '''Hello()''' [[constructor (computer science)|constructor]] initializes the frame by first calling the superclass constructor, passing the parameter "hello", which is used as the window's title. [10441940] |It then calls the '''{{Javadoc:SE|name=setDefaultCloseOperation(int)|javax/swing|JFrame|setDefaultCloseOperation(int)}}''' method inherited from JFrame to set the default operation when the close control on the title bar is selected to '''{{Javadoc:SE|javax/swing|WindowConstants|EXIT_ON_CLOSE}}''' — this causes the JFrame to be disposed of when the frame is closed (as opposed to merely hidden), which allows the JVM to exit and the program to terminate. [10441950] |Next, the [[Layout manager|layout]] of the frame is set to a BorderLayout; this tells Swing how to arrange the components that will be added to the frame. 
[10441960] |A '''{{Javadoc:SE|javax/swing|JLabel}}''' is created for the string '''"Hello, world!"''' and the '''{{Javadoc:SE|name=add(Component)|java/awt|Container|add(java.awt.Component)}}''' method inherited from the {{Javadoc:SE|java/awt|Container}} superclass is called to add the label to the frame. [10441970] |The '''{{Javadoc:SE|name=pack()|java/awt|Window|pack()}}''' method inherited from the {{Javadoc:SE|java/awt|Window}} superclass is called to size the window and lay out its contents, in the manner indicated by the BorderLayout. [10441980] |The '''main()''' method is called by the JVM when the program starts. [10441990] |It [[Instance (programming)|instantiates]] a new '''Hello''' frame and causes it to be displayed by calling the '''{{Javadoc:SE|name=setVisible(boolean)|java/awt|Component|setVisible(boolean)}}''' method inherited from the {{Javadoc:SE|java/awt|Component}} superclass with the boolean parameter '''true'''. [10442000] |Note that once the frame is displayed, exiting the main method does not cause the program to terminate because the AWT [[event dispatching thread]] remains active until all of the Swing top-level windows have been disposed. [10442010] |== Criticism == [10442020] |[[Java performance|Java's performance]] has improved substantially since the early versions, and performance of [[JIT compiler]]s relative to native compilers has in some tests been shown to be quite similar. [10442030] |The performance of the compilers does not necessarily indicate the performance of the compiled code; only careful testing can reveal the true performance issues in any system. [10442040] |The default [[look and feel]] of [[Graphical User Interface|GUI]] applications written in Java using the [[Swing (Java)|Swing]] toolkit is very different from native applications. [10442050] |It is possible to specify a different look and feel through the [[pluggable look and feel]] system of Swing. 
[10442060] |Clones of [[Microsoft Windows|Windows]], [[GTK]] and [[Motif (widget toolkit)|Motif]] are supplied by Sun. [10442070] |[[Apple Computer|Apple]] also provides an [[Aqua (theme)|Aqua]] look and feel for [[Mac OS X]]. [10442080] |Though prior implementations of these looks and feels have been considered lacking, Swing in Java SE 6 addresses this problem by using more native [[Widget (computing)|widget]] drawing routines of the underlying platforms. [10442090] |Alternatively, third party toolkits such as [[wx4j]], [[Qt (toolkit)|Qt Jambi]] or [[Standard Widget Toolkit|SWT]] may be used for increased integration with the native windowing system. [10442100] |As in C++ and some other object-oriented languages, variables of Java's [[primitive type]]s were not originally objects. [10442110] |Values of primitive types are either stored directly in fields (for objects) or on the [[Stack-based memory allocation|stack]] (for methods) rather than on the heap, as is the common case for objects (but see [[Escape analysis]]). [10442120] |This was a conscious decision by Java's designers for performance reasons. [10442130] |Because of this, Java was not considered to be a pure object-oriented programming language. [10442140] |However, as of Java 5.0, [[Object type|autoboxing]] enables programmers to write as if primitive types are their wrapper classes, with their object-oriented counterparts representing classes of their own, and freely interchange between them for improved flexibility. [10442150] |Java suppresses several features (such as [[operator overloading]] and [[multiple inheritance]]) for ''classes'' in order to simplify the language, to "save the programmers from themselves", and to prevent possible errors and anti-pattern design. [10442160] |This has been a source of criticism, relating to a lack of low-level features, but some of these limitations may be worked around. [10442170] |Java ''interfaces'' have always had multiple inheritance. 
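The two language features discussed above, autoboxing between primitive types and their wrapper classes and multiple inheritance of interfaces, can be illustrated with a minimal sketch. The class and interface names (BoxingDemo, Player, Named, Scored) are hypothetical and chosen only for this illustration; the code assumes Java 5 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical.
class BoxingDemo {
    // Sums a list of Integer objects; each element is unboxed to int.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {   // Integer -> int (auto-unboxing)
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<Integer>();
        values.add(1);           // int -> Integer (autoboxing)
        values.add(2);
        values.add(3);
        System.out.println(sum(values));                // prints 6

        Player p = new Player();
        Named n = p;             // the same object viewed through
        Scored s = p;            // two unrelated interface types
        System.out.println(n.name() + " " + s.score()); // prints "player 42"
    }
}

// Two unrelated interfaces; a class may implement both.
interface Named { String name(); }
interface Scored { int score(); }

// A class can extend only one class but implement several interfaces.
class Player implements Named, Scored {
    public String name() { return "player"; }
    public int score() { return 42; }
}
```

The collection still stores Integer objects; the compiler merely inserts the conversions, so the distinction between primitives and objects remains in the language.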
[10442180] |== Resources == [10442190] |=== Java Runtime Environment === [10442200] |The Java Runtime Environment, or ''JRE'', is the software required to run any [[Application software|application]] deployed on the Java Platform. [10442210] |[[End-user]]s commonly use a JRE in [[Software package (programming)|software package]]s and Web browser [[plugin]]s. [10442220] |Sun also distributes a superset of the JRE called the Java 2 [[SDK]] (more commonly known as the JDK), which includes development tools such as the [[Java compiler]], [[Javadoc]], [[JAR (file format)|Jar]] and [[debugger]]. [10442230] |One of the unique advantages of the concept of a runtime engine is that errors (exceptions) should not 'crash' the system. [10442240] |Moreover, in runtime engine environments such as Java there exist tools that attach to the runtime engine and every time that an exception of interest occurs they record debugging information that existed in memory at the time the exception was thrown (stack and heap values). [10442250] |These [[Automated Exception Handling]] tools provide 'root-cause' information for exceptions in Java programs that run in production, testing or development environments. [10442260] |==== Components ==== [10442270] |* Java [[Library (computer science)|libraries]] are the compiled [[byte code]]s of [[source code]] developed by the JRE implementor to support application development in Java. 
[10442280] |Examples of these libraries are: [10442290] |** The core libraries, which include: [10442300] |*** Collection libraries that implement [[data structure]]s such as [[List (computing)|lists]], [[associative array|dictionaries]], [[tree structure|trees]] and [[Set (computer science)|sets]] [10442310] |*** [[XML]] Processing (Parsing, Transforming, Validating) libraries [10442320] |*** Security [10442330] |*** [[i18n|Internationalization and localization]] libraries [10442340] |** The integration libraries, which allow the application writer to communicate with external systems. [10442350] |These libraries include: [10442360] |*** The [[Java Database Connectivity]] (JDBC) [[Application Programming Interface|API]] for database access [10442370] |*** [[Java Naming and Directory Interface]] (JNDI) for lookup and discovery [10442380] |*** [[Java remote method invocation|RMI]] and [[CORBA]] for distributed application development [10442390] |** [[User Interface]] libraries, which include: [10442400] |*** The (heavyweight, or [[native mode|native]]) [[Abstract Windowing Toolkit]] (AWT), which provides [[graphical user interface|GUI]] components, the means for laying out those components and the means for handling events from those components [10442410] |*** The (lightweight) [[Swing (Java)|Swing]] libraries, which are built on AWT but provide (non-native) implementations of the AWT widgetry [10442420] |*** APIs for audio capture, processing, and playback [10442430] |* A platform dependent implementation of [[Java virtual machine]] (JVM) that is the means by which the byte codes of the Java libraries and third party applications are executed [10442440] |* Plugins, which enable [[Java applet|applet]]s to be run in [[Web browser]]s [10442450] |* [[Java Web Start]], which allows Java applications to be efficiently distributed to [[end user]]s across the [[Internet]] [10442460] |* Licensing and documentation [10442470] |=== APIs === [10442480] |Sun has defined three 
platforms targeting different application environments and segmented many of its [[application programming interface|API]]s so that they belong to one of the platforms. [10442490] |The platforms are: [10442500] |* [[Java Platform, Micro Edition]] (Java ME) — targeting environments with limited resources, [10442510] |* [[Java Platform, Standard Edition]] (Java SE) — targeting workstation environments, and [10442520] |* [[Java Platform, Enterprise Edition]] (Java EE) — targeting large distributed enterprise or Internet environments. [10442530] |The [[Class (computer science)|classes]] in the Java APIs are organized into separate groups called [[Java package|packages]]. [10442540] |Each package contains a set of related [[Interface (Java)|interface]]s, classes and [[exception handling|exceptions]]. [10442550] |Refer to the separate platforms for a description of the packages available. [10442560] |The set of APIs is controlled by Sun Microsystems in cooperation with others through the [[Java Community Process]] program. [10442570] |Companies or individuals participating in this process can influence the design and development of the APIs. [10442580] |This process has been a subject of controversy. [10450010] |
Language
[10450020] |A '''language''' is a dynamic set of visual, auditory, or tactile [[symbol]]s of [[communication]] and the elements used to manipulate them. [10450030] |''Language'' can also refer to the use of such systems as a general [[phenomenon]]. [10450040] |Language is considered to be an exclusively human mode of communication; although other animals make use of quite sophisticated communicative systems, none of these are known to make use of all of the properties that linguists use to define language. [10450050] |== Properties of language == [10450060] |A set of agreed-upon symbols is only one feature of language; all languages must define the structural relationships between these symbols in a system of [[grammar]]. [10450070] |Rules of grammar are what distinguish language from other forms of communication. [10450080] |They allow a finite set of symbols to be manipulated to create a potentially infinite number of grammatical utterances. [10450090] |Another property of language is that its symbols are [[arbitrary]]. [10450100] |Any concept or grammatical rule can be mapped onto a symbol. [10450110] |Most languages make use of sound, but the combinations of sounds used do not have any ''inherent'' meaning – they are merely an agreed-upon convention to represent a certain thing by users of that language. [10450120] |For instance, there is nothing about the [[Spanish language|Spanish]] [[word]] ''{{lang|es|nada}}'' itself that forces Spanish speakers to convey the idea of "nothing". [10450130] |Another set of sounds (for example, the English word ''nothing'') could equally be used to represent the same concept, but all Spanish speakers have acquired or learned to correlate this meaning for this particular sound pattern. [10450140] |For [[Slovene language|Slovenian]], [[Croatian language|Croatian]], [[Serbian language|Serbian/Kosovan]] or [[Bosnian language|Bosnian]] speakers on the other hand, ''{{lang|hr|nada}}'' means something else; it means "hope". 
[10450150] |==The study of language== [10450160] |===Linguistics=== [10450170] |[[Linguistics]] is the [[science|scientific]] and [[philosophy|philosophical]] study of language, encompassing a number of sub-fields. [10450180] |At the core of [[theoretical linguistics]] are the study of language structure ([[grammar]]) and the study of meaning ([[semantics]]). [10450190] |The first of these encompasses [[morphology (linguistics)|morphology]] (the formation and composition of [[word]]s), [[syntax]] (the rules that determine how words combine into [[phrase]]s and [[Sentence (linguistics)|sentences]]) and [[phonology]] (the study of sound systems and abstract sound units). [10450200] |[[Phonetics]] is a related branch of linguistics concerned with the actual properties of speech sounds ([[phone]]s), non-speech sounds, and how they are produced and [[speech perception|perceived]]. [10450210] |[[Theoretical linguistics]] is mostly concerned with developing models of linguistic knowledge. [10450220] |The fields that are generally considered the core of theoretical linguistics are [[syntax]], [[phonology]], [[Morphology (linguistics)|morphology]], and [[semantics]]. [10450230] |[[Applied linguistics]] attempts to put linguistic theories into practice through areas like [[translation]], [[Stylistics (linguistics)|stylistics]], [[literary criticism]] and [[Literary theory|theory]], [[discourse analysis]], [[speech therapy]], speech pathology and [[Second language acquisition|foreign language teaching]]. [10450240] |===History=== [10450250] |The historical record of [[linguistics]] begins in [[India]] with [[Pāṇini]], the [[5th century BCE]] grammarian who formulated 3,959 rules of [[Sanskrit language|Sanskrit]] [[morphology (linguistics)|morphology]], known as the ''{{IAST|[[Aṣṭādhyāyī]]}}'' (अष्टाध्यायी), and with [[Tolkāppiyar]], the [[3rd century BCE]] grammarian of the [[Tamil language|Tamil]] work [[Tolkāppiyam]]. Pāṇini's grammar is highly systematized and technical.
[10450260] |Inherent in its analytic approach are the concepts of the [[phoneme]], the [[morpheme]], and the [[Root (linguistics)|root]]; Western linguists only recognized the phoneme some two millennia later. [10450270] |Tolkāppiyar's work is perhaps the first to describe [[articulatory phonetics]] for a language. [10450280] |Its classification of the alphabet into [[consonant]]s and [[vowel]]s, and elements like nouns, verbs, vowels, and consonants, which he put into classes, were also breakthroughs at the time. [10450290] |In the [[Middle East]], the [[Persian Empire|Persian]] linguist [[Sibawayh]] (سیبویه) made a detailed and professional description of [[Arabic language|Arabic]] in 760 CE in his monumental work, ''Al-kitab fi al-nahw'' (الكتاب في النحو, ''The Book on Grammar''), bringing many [[Linguistics|linguistic]] aspects of language to light. [10450300] |In his book, he distinguished [[phonetics]] from [[phonology]]. [10450310] |Later in the West, the success of [[science]], [[mathematics]], and other [[formal system]]s in the 20th century led many to attempt a formalization of the study of language as a "semantic code". [10450320] |This resulted in the [[academic discipline]] of [[linguistics]], the founding of which is attributed to [[Ferdinand de Saussure]]. [10450330] |In the 20th century, substantial contributions to the understanding of language came from [[Ferdinand de Saussure]], [[Hjelmslev]], [[Émile Benveniste]] and [[Roman Jakobson]], which are characterized as being highly [[systematic]]. [10450340] |== Human languages == [10450350] |Human languages are usually referred to as natural languages, and the science of studying them falls under the purview of [[linguistics]]. [10450360] |A common progression for natural languages is that they are considered to be first spoken, then written, and then an understanding and explanation of their grammar is attempted. [10450370] |Languages live, die, move from place to place, and change with time. 
[10450380] |Any language that ceases to change or develop is categorized as a [[dead language]]. [10450390] |Conversely, any language that is a ''living language,'' that is, it is in a continuous state of change, is known as a [[modern language]]. [10450400] |Making a principled distinction between one language and another is usually impossible. [10450410] |For instance, there are a few [[dialect]]s of [[German language|German]] similar to some dialects of [[Dutch language|Dutch]]. [10450420] |The transition between languages within the same [[language family]] is sometimes gradual (see [[dialect continuum]]). [10450430] |Some like to make parallels with [[biology]], where it is not possible to make a well-defined distinction between one species and the next. [10450440] |In either case, the ultimate difficulty may stem from the [[interaction]]s between languages and [[population]]s. [10450450] |(See [[Dialect]] or [[August Schleicher]] for a longer discussion.) [10450460] |The concepts of [[Ausbausprache - Abstandsprache - Dachsprache|Ausbausprache, Abstandsprache and Dachsprache]] are used to make finer distinctions about the degrees of difference between languages or dialects. [10450470] |==Artificial languages== [10450480] |=== Constructed languages === [10450490] |Some individuals and groups have constructed their own artificial languages, for practical, experimental, personal, or ideological reasons. [10450500] |International auxiliary languages are generally constructed languages that strive to be easier to learn than natural languages; other constructed languages strive to be more logical ("loglangs") than natural languages; a prominent example of this is [[Lojban]]. [10450510] |Some writers, such as [[J. R. R. Tolkien]], have created fantasy languages, for literary, [[Artistic language|artistic]] or personal reasons. 
[10450520] |The fantasy language of the [[Klingon]] race has in recent years been developed by fans of the Star Trek series, including a vocabulary and grammar. [10450530] |Constructed languages are not necessarily restricted to the properties shared by natural languages. [10450540] |This part of ISO 639 also includes identifiers that denote constructed (or artificial) languages. [10450550] |In order to qualify for inclusion the language must have a literature and it must be designed for the purpose of human communication. [10450560] |Specifically excluded are reconstructed languages and computer programming languages. [10450570] |===International auxiliary languages=== [10450580] |Some languages, most constructed, are meant specifically for communication between people of different nationalities or language groups as an easy-to-learn second language. [10450590] |Several of these languages have been constructed by individuals or groups. [10450600] |Natural, pre-existing languages may also be used in this way - their developers merely catalogued and standardized their vocabulary and identified their grammatical rules. [10450610] |These languages are called ''naturalistic.'' [10450620] |One such language, [[Latino Sine Flexione]], is a simplified form of Latin. [10450630] |Two others, [[Occidental language|Occidental]] and [[Novial]], were drawn from several Western languages. [10450640] |To date, the most successful auxiliary language is [[Esperanto]], invented by Polish ophthalmologist [[L. L. Zamenhof|Zamenhof]]. [10450650] |It has a relatively large community roughly estimated at about 2 million speakers worldwide, with a large body of literature, songs, and is the only known constructed language to have [[Native Esperanto speakers|native speakers]], such as the Hungarian-born American businessman [[George Soros]]. [10450660] |Other auxiliary languages with a relatively large number of speakers and literature are [[Interlingua]] and [[Ido]]. 
[10450670] |===Controlled languages=== [10450680] |Controlled natural languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity. [10450690] |The purpose behind the development and implementation of a controlled natural language typically is to aid non-native speakers of a natural language in understanding it, or to ease computer processing of a natural language. [10450700] |An example of a widely used controlled natural language is [[Simplified English]], which was originally developed for [[aerospace]] industry maintenance manuals. [10450710] |== Formal languages == [10450720] |[[Mathematics]] and [[computer science]] use artificial entities called formal languages (including [[programming language]]s and [[markup language]]s, and some that are more theoretical in nature). [10450730] |These often take the form of [[character string]]s, produced by a combination of [[formal grammar]] and semantics of arbitrary complexity. [10450740] |=== Programming languages === [10450750] |A programming language is an extreme case of a formal language that can be used to control the behavior of a machine, particularly a computer, to perform specific tasks. [10450760] |Programming languages are defined using syntactic and semantic rules, to determine structure and meaning respectively. [10450770] |Programming languages are used to facilitate communication about the task of organizing and manipulating information, and to express algorithms precisely. [10450780] |Some authors restrict the term "programming language" to those languages that can express all possible algorithms; sometimes the term "computer language" is used for artificial languages that are more limited. [10450790] |== Animal communication == [10450800] |The term "[[animal language]]s" is often used for non-human languages. 
[10450810] |Linguists do not consider these to be "language", but describe them as [[animal communication]], because the interaction between animals in such communication is fundamentally different in its underlying principles from human language. [10450820] |Nevertheless, some scholars have tried to disprove this mainstream premise through experiments on training chimpanzees to talk. [10450830] |[[Karl von Frisch]] received the Nobel Prize in 1973 for his proof of the language and dialects of the bees. [10450840] |In several publicized instances, non-human animals have been taught to understand certain features of human language. [10450850] |[[Chimpanzee]]s, [[gorilla]]s, and [[orangutan]]s have been taught hand signs based on [[American Sign Language]]. [10450860] |The [[African Grey Parrot]], which possesses the ability to mimic human speech with a high degree of accuracy, is suspected of having sufficient intelligence to comprehend some of the speech it mimics. [10450870] |Most species of [[parrot]], despite expert mimicry, are believed to have no linguistic comprehension at all. [10450880] |While proponents of animal communication systems have debated levels of [[semantics]], these systems have not been found to have anything approaching human language [[syntax]]. [10460010] |
Language model
[10460020] |A statistical '''language model''' assigns a [[probability]] to a sequence of ''m'' words P(w_1,\ldots,w_m) by means of a [[probability distribution]]. [10460030] |Language modeling is used in many [[natural language processing]] applications such as [[speech recognition]], [[machine translation]], [[part-of-speech tagging]], [[parsing]] and [[information retrieval]]. [10460040] |In [[speech recognition]] and in [[data compression]], such a model tries to capture the properties of a language, and to predict the next word in a speech sequence. [10460050] |When used in information retrieval, a language model is associated with a [[document]] in a collection. [10460060] |With query ''Q'' as input, retrieved documents are ranked based on the probability that the document's language model would generate the terms of the query, ''P(Q|M_d)''. [10460070] |Estimating the probability of sequences can become difficult in [[corpora]], in which [[phrase]]s or [[Sentence (linguistics)|sentence]]s can be arbitrarily long and hence some sequences are not observed during [[training]] of the language model ([[data sparseness problem]] of [[overfitting]]). [10460080] |For that reason these models are often approximated using smoothed [[N-gram]] models. [10460090] |== N-gram models == [10460100] |In an n-gram model, the probability P(w_1,\ldots,w_m) of observing the sentence w_1,\ldots,w_m is approximated as [10460110] | P(w_1,\ldots,w_m) = \prod^m_{i=1} P(w_i|w_1,\ldots,w_{i-1}) \approx \prod^m_{i=1} P(w_i|w_{i-(n-1)},\ldots,w_{i-1}) [10460120] |Here, it is assumed that the probability of observing the ''i''th word ''w_i'' in the context history of the preceding ''i-1'' words can be approximated by the probability of observing it in the shortened context history of the preceding ''n-1'' words (an (''n''-1)th-order [[Markov property]]).
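The n-gram approximation above can be made concrete for ''n=2''. The following is a minimal Python sketch, not a reference implementation: it estimates each bigram probability as a relative frequency of counts, without smoothing, over a three-sentence toy corpus. The corpus, the function names and the sentence-start marker <code><s></code> are all assumptions made for illustration.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start marker, not part of the text above


def train_bigram(sentences):
    """Count contexts and bigrams over a list of tokenised sentences."""
    context_counts = Counter()  # how often a word is followed by another word
    bigram_counts = Counter()   # how often the pair (prev, word) occurs
    for tokens in sentences:
        padded = [START] + tokens
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts


def bigram_prob(word, prev, context_counts, bigram_counts):
    """Relative-frequency estimate of P(word | prev); 0.0 for unseen contexts."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]


def sentence_prob(tokens, context_counts, bigram_counts):
    """Product of bigram probabilities over the whole sentence."""
    padded = [START] + tokens
    p = 1.0
    for prev, word in zip(padded, padded[1:]):
        p *= bigram_prob(word, prev, context_counts, bigram_counts)
    return p


# Toy corpus, invented for this example.
corpus = [
    "I saw the red house".split(),
    "I saw the dog".split(),
    "the red house is old".split(),
]
ctx, bi = train_bigram(corpus)
print(sentence_prob("I saw the red house".split(), ctx, bi))
print(sentence_prob("I saw the old house".split(), ctx, bi))
```

On this corpus the first sentence receives probability 4/9, while the second receives 0 because the bigram "the old" never occurs. This is the data sparseness problem that smoothing is meant to address.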
[10460130] |The conditional probability can be calculated from n-gram frequency counts: P(w_i|w_{i-(n-1)},\ldots,w_{i-1}) = \frac{count(w_{i-(n-1)},\ldots,w_{i-1},w_i)}{count(w_{i-(n-1)},\ldots,w_{i-1})} [10460140] |The terms '''bigram''' and '''trigram''' language model denote n-gram language models with ''n=2'' and ''n=3'', respectively. [10460150] |=== Example === [10460160] |In a bigram (n=2) language model, the probability of the sentence ''I saw the red house'' is approximated as P(I,saw,the,red,house) \approx P(I) P(saw|I) P(the|saw) P(red|the) P(house|red) [10460170] |whereas in a trigram (n=3) language model, the approximation is P(I,saw,the,red,house) \approx P(I) P(saw|I) P(the|I,saw) P(red|saw,the) P(house|the,red) [10470010] |
Latent semantic analysis
[10470020] |'''Latent semantic analysis (LSA)''' is a technique in [[natural language processing]], in particular in [[vectorial semantics]], for analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. [10470030] |LSA was patented in [[1988]] ([http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=4839853 US Patent 4,839,853]) by [[Scott Deerwester]], [[Susan Dumais]], [[George Furnas]], [[Richard Harshman]], [[Thomas Landauer]], [[Karen Lochbaum]] and [[Lynn Streeter]]. [10470040] |In the context of its application to [[information retrieval]], it is sometimes called '''latent semantic indexing (LSI)'''. [10470050] |== Occurrence matrix == [10470060] |LSA can use a [[term-document matrix]] which describes the occurrences of terms in documents; it is a [[sparse matrix]] whose rows correspond to [[terminology|terms]] (typically [[stemming|stemmed]] words that appear in the documents) and whose columns correspond to documents. [10470070] |A typical example of the weighting of the elements of the matrix is [[tf-idf]] (term frequency–inverse document frequency): the element of the matrix is proportional to the number of times the terms appear in each document, where rare terms are upweighted to reflect their relative importance. [10470080] |This matrix is also common to standard semantic models, though it is not necessarily explicitly expressed as a matrix, since the mathematical properties of matrices are not always used. [10470090] |LSA transforms the occurrence matrix into a relation between the terms and some ''concepts'', and a relation between those concepts and the documents. [10470100] |Thus the terms and documents are now indirectly related through the concepts. [10470110] |== Applications == [10470120] |The new concept space typically can be used to: [10470130] |* Compare the documents in the concept space ([[data clustering]], [[document classification]]).
[10470140] |* Find similar documents across languages, after analyzing a base set of translated documents ([[cross language retrieval]]). [10470150] |* Find relations between terms ([[synonymy]] and [[polysemy]]). [10470160] |* Given a query of terms, translate it into the concept space, and find matching documents ([[information retrieval]]). [10470170] |Synonymy and polysemy are fundamental problems in [[natural language processing]]: [10470180] |* Synonymy is the phenomenon where different words describe the same idea. [10470190] |Thus, a query in a search engine may fail to retrieve a relevant document that does not contain the words which appeared in the query. [10470200] |For example, a search for "doctors" may not return a document containing the word "physicians", even though the words have the same meaning. [10470210] |* Polysemy is the phenomenon where the same word has multiple meanings. [10470220] |So a search may retrieve irrelevant documents containing the desired words in the wrong meaning. [10470230] |For example, a botanist and a computer scientist looking for the word "tree" probably desire different sets of documents. [10470240] |== Rank lowering == [10470250] |After the construction of the occurrence matrix, LSA finds a low-[[rank (matrix theory)|rank]] approximation to the [[term-document matrix]]. [10470260] |There could be various reasons for these approximations: [10470270] |* The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an ''approximation'' (a "least and necessary evil"). [10470280] |* The original term-document matrix is presumed ''noisy'': for example, anecdotal instances of terms are to be eliminated. [10470290] |From this point of view, the approximated matrix is interpreted as a ''de-noisified matrix'' (a better matrix than the original). 
[10470300] |* The original term-document matrix is presumed overly [[Sparse matrix|sparse]] relative to the "true" term-document matrix. [10470310] |That is, the original matrix lists only the words actually ''in'' each document, whereas we might be interested in all words ''related to'' each document--generally a much larger set due to [[synonymy]]. [10470320] |The consequence of the rank lowering is that some dimensions are combined and depend on more than one term: [10470330] |:: {(car), (truck), (flower)} --> {(1.3452 * car + 0.2828 * truck), (flower)} [10470340] |This mitigates synonymy, as the rank lowering is expected to merge the dimensions associated with terms that have similar meanings. [10470350] |It also mitigates polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. [10470360] |Conversely, components that point in other directions tend to either simply cancel out, or, at worst, to be smaller than components in the directions corresponding to the intended sense. [10470370] |== Derivation == [10470380] |Let X be a matrix where element (i,j) describes the occurrence of term i in document j (this can be, for example, the frequency). 
[10470385] |X will look like this: [10470390] |: \begin{matrix} & \textbf{d}_j \\ & \downarrow \\ \textbf{t}_i^T \rightarrow & \begin{bmatrix} x_{1,1} & \dots & x_{1,n} \\ \vdots & \ddots & \vdots \\ x_{m,1} & \dots & x_{m,n} \\ \end{bmatrix} \end{matrix} [10470400] |Now a row in this matrix will be a vector corresponding to a term, giving its relation to each document: [10470410] |:\textbf{t}_i^T = \begin{bmatrix} x_{i,1} & \dots & x_{i,n} \end{bmatrix} [10470420] |Likewise, a column in this matrix will be a vector corresponding to a document, giving its relation to each term: [10470430] |:\textbf{d}_j = \begin{bmatrix} x_{1,j} \\ \vdots \\ x_{m,j} \end{bmatrix} [10470440] |Now the [[dot product]] \textbf{t}_i^T \textbf{t}_p between two term vectors gives the [[correlation]] between the terms over the documents. [10470450] |The [[matrix product]] X X^T contains all these dot products. [10470460] |Element (i,p) (which is equal to element (p,i)) contains the dot product \textbf{t}_i^T \textbf{t}_p ( = \textbf{t}_p^T \textbf{t}_i). [10470470] |Likewise, the matrix X^T X contains the dot products between all the document vectors, giving their correlation over the terms: \textbf{d}_j^T \textbf{d}_q = \textbf{d}_q^T \textbf{d}_j. [10470480] |Now assume that there exists a decomposition of X such that U and V are [[orthonormal matrix|orthonormal matrices]] and \Sigma is a [[diagonal matrix]]. 
[10470490] |This is called a [[singular value decomposition]] (SVD): [10470500] |: X = U \Sigma V^T [10470510] |The matrix products giving us the term and document correlations then become [10470520] |: \begin{matrix} X X^T &=& (U \Sigma V^T) (U \Sigma V^T)^T = (U \Sigma V^T) (V^{T^T} \Sigma^T U^T) = U \Sigma V^T V \Sigma^T U^T = U \Sigma \Sigma^T U^T \\ X^T X &=& (U \Sigma V^T)^T (U \Sigma V^T) = (V^{T^T} \Sigma^T U^T) (U \Sigma V^T) = V \Sigma^T U^T U \Sigma V^T = V \Sigma^T \Sigma V^T \end{matrix} [10470530] |Since \Sigma \Sigma^T and \Sigma^T \Sigma are diagonal we see that U must contain the [[eigenvector]]s of X X^T, while V must contain the eigenvectors of X^T X. [10470540] |Both products have the same non-zero eigenvalues, given by the non-zero entries of \Sigma \Sigma^T, or equally, by the non-zero entries of \Sigma^T\Sigma. [10470550] |Now the decomposition looks like this: [10470560] |: \begin{matrix} & X & & & U & & \Sigma & & V^T \\ & (\textbf{d}_j) & & & & & & & (\hat \textbf{d}_j) \\ & \downarrow & & & & & & & \downarrow \\ (\textbf{t}_i^T) \rightarrow & \begin{bmatrix} x_{1,1} & \dots & x_{1,n} \\ \\ \vdots & \ddots & \vdots \\ \\ x_{m,1} & \dots & x_{m,n} \\ \end{bmatrix} & = & (\hat \textbf{t}_i^T) \rightarrow & \begin{bmatrix} \begin{bmatrix} \, \\ \, \\ \textbf{u}_1 \\ \, \\ \,\end{bmatrix} \dots \begin{bmatrix} \, \\ \, \\ \textbf{u}_l \\ \, \\ \, \end{bmatrix} \end{bmatrix} & \cdot & \begin{bmatrix} \sigma_1 & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sigma_l \\ \end{bmatrix} & \cdot & \begin{bmatrix} \begin{bmatrix} & & \textbf{v}_1 & & \end{bmatrix} \\ \vdots \\ \begin{bmatrix} & & \textbf{v}_l & & \end{bmatrix} \end{bmatrix} \end{matrix} [10470570] |The values \sigma_1, \dots, \sigma_l are called the singular values, and u_1, \dots, u_l and v_1, \dots, v_l the left and right singular vectors. [10470580] |Notice how the only part of U that contributes to \textbf{t}_i is the i\textrm{'th} row.
[10470590] |Let this row vector be called \hat \textbf{t}_i. [10470600] |Likewise, the only part of V^T that contributes to \textbf{d}_j is the j\textrm{'th} column, \hat \textbf{d}_j. [10470610] |These are ''not'' the eigenvectors, but ''depend'' on ''all'' the eigenvectors. [10470620] |It turns out that when you select the k largest singular values, and their corresponding singular vectors from U and V, you get the rank k approximation to X with the smallest error ([[Frobenius norm]]). [10470630] |A useful property of this approximation is that it not only has minimal error, but also translates the term and document vectors into a concept space. [10470640] |The vector \hat \textbf{t}_i then has k entries, each giving the weight of term i in one of the k concepts. [10470650] |Likewise, the vector \hat \textbf{d}_j gives the relation between document j and each concept. [10470660] |We write this approximation as [10470670] |:X_k = U_k \Sigma_k V_k^T [10470680] |You can now do the following: [10470690] |* See how related documents j and q are in the concept space by comparing the vectors \hat \textbf{d}_j and \hat \textbf{d}_q (typically by [[vector space model|cosine similarity]]). [10470700] |This gives you a clustering of the documents. [10470710] |* Compare terms i and p by comparing the vectors \hat \textbf{t}_i and \hat \textbf{t}_p, giving you a clustering of the terms in the concept space. [10470720] |* Given a query, view this as a mini document, and compare it to your documents in the concept space. [10470730] |To do the latter, you must first translate your query into the concept space.
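The truncation and comparison steps listed above can be tried on a small example. The following Python sketch assumes NumPy (the text names no library) and a hand-made 4×4 term-document matrix; the matrix and the helper names <code>lsa</code>, <code>cosine</code> and <code>fold_in</code> are hypothetical. The <code>fold_in</code> helper maps a raw term-count vector to the concept space with \Sigma_k^{-1} U_k^T, the same mapping that takes a document column \textbf{d}_j to \hat \textbf{d}_j.

```python
import numpy as np

# Toy term-document matrix X: rows are terms, columns are documents.
# The counts are invented for illustration.
terms = ["car", "truck", "flower", "petal"]
X = np.array([
    [2.0, 1.0, 0.0, 0.0],  # car
    [1.0, 2.0, 0.0, 0.0],  # truck
    [0.0, 0.0, 3.0, 1.0],  # flower
    [0.0, 0.0, 1.0, 2.0],  # petal
])


def lsa(X, k):
    """Return U_k, the k largest singular values, and V_k^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def fold_in(vec, U_k, s_k):
    """Map a term-count vector into the concept space: Sigma_k^{-1} U_k^T vec."""
    return (U_k.T @ vec) / s_k


U_k, s_k, Vt_k = lsa(X, k=2)
X_k = U_k @ np.diag(s_k) @ Vt_k   # rank-2 approximation of X
doc_hat = Vt_k                    # column j is the concept-space vector of document j

# Documents 0 and 1 share vocabulary; documents 0 and 2 do not.
print(cosine(X[:, 0], X[:, 1]), cosine(doc_hat[:, 0], doc_hat[:, 1]))
print(cosine(doc_hat[:, 0], doc_hat[:, 2]))

# A one-word query ("car") treated as a mini document.
q = np.array([1.0, 0.0, 0.0, 0.0])
q_hat = fold_in(q, U_k, s_k)
print(cosine(q_hat, doc_hat[:, 1]))
```

In this toy case the two vehicle documents, which have cosine similarity 0.8 in the original term space, collapse onto the same concept direction, and the query containing only "car" matches the document dominated by "truck". Real collections are far less cleanly separated, so this should be read as an illustration of the mechanics only.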
[10470740] |It is then intuitive that you must use the same transformation that you use on your documents: [10470750] |:\textbf{d}_j = U_k \Sigma_k \hat \textbf{d}_j [10470760] |:\hat \textbf{d}_j = \Sigma_k^{-1} U_k^T \textbf{d}_j [10470770] |This means that if you have a query vector q, you must do the translation \hat \textbf{q} = \Sigma_k^{-1} U_k^T \textbf{q} before you compare it with the document vectors in the concept space. [10470780] |You can do the same for pseudo term vectors: [10470790] |:\textbf{t}_i^T = \hat \textbf{t}_i^T \Sigma_k V_k^T [10470800] |:\hat \textbf{t}_i^T = \textbf{t}_i^T V_k^{-T} \Sigma_k^{-1} = \textbf{t}_i^T V_k \Sigma_k^{-1} [10470810] |:\hat \textbf{t}_i = \Sigma_k^{-1} V_k^T \textbf{t}_i [10470820] |== Implementation == [10470830] |The [[Singular Value Decomposition|SVD]] is typically computed using large matrix methods (for example, [[Lanczos method]]s) but may also be computed incrementally and with greatly reduced resources via a [[neural network]]-like approach, which does not require the large, full-rank matrix to be held in memory ([http://www.dcs.shef.ac.uk/~genevieve/gorrell_webb.pdf Gorrell and Webb, 2005]). [10470840] |A fast, incremental, low-memory, large-matrix SVD algorithm has recently been developed ([http://www.merl.com/publications/TR2006-059/ Brand, 2006]). [10470850] |Unlike Gorrell and Webb's (2005) stochastic approximation, Brand's (2006) algorithm provides an exact solution. [10470860] |== Limitations == [10470870] |LSA has two drawbacks: [10470880] |* The resulting dimensions might be difficult to interpret. [10470890] |For instance, in [10470900] |:: {(car), (truck), (flower)} --> {(1.3452 * car + 0.2828 * truck), (flower)} [10470910] |:the (1.3452 * car + 0.2828 * truck) component could be interpreted as "vehicle". [10470920] |However, it is very likely that cases close to [10470930] |:: {(car), (bottle), (flower)} --> {(1.3452 * car + 0.2828 * bottle), (flower)} [10470940] |:will occur. 
[10470950] |This leads to results which can be justified on the mathematical level, but have no interpretable meaning in natural language. [10470960] |* The [[probabilistic model]] of LSA does not match observed data: LSA assumes that words and documents form a joint [[normal distribution|Gaussian]] model ([[ergodic hypothesis]]), while a [[Poisson distribution]] has been observed. [10470970] |Thus, a newer alternative is [[probabilistic latent semantic analysis]], based on a [[multinomial distribution|multinomial]] model, which is reported to give better results than standard LSA. [10480010] |
Linguistics
[10480020] |'''Linguistics''' is the [[science|scientific]] study of [[language]], encompassing a number of sub-fields. [10480030] |An important topical division is between the study of language structure ([[grammar]]) and the study of [[Meaning (linguistics)|meaning]] ([[semantics]]). [10480040] |Grammar encompasses [[morphology (linguistics)|morphology]] (the formation and composition of [[word]]s), [[syntax]] (the rules that determine how words combine into [[phrase]]s and [[Sentence (linguistics)|sentences]]) and [[phonology]] (the study of sound systems and abstract sound units). [10480050] |[[Phonetics]] is a related branch of linguistics concerned with the actual properties of speech sounds ([[phone]]s), non-speech sounds, and how they are produced and [[speech perception|perceived]]. [10480060] |Over the twentieth century, following the work of [[Noam Chomsky]], linguistics came to be dominated by the [[Generative grammar|Generativist school]], which is chiefly concerned with explaining how human beings [[language acquisition|acquire language]] and the biological constraints on this acquisition; generative theory is [[Language module|modularist]] in character. [10480070] |While this remains the dominant paradigm, other linguistic theories have increasingly gained in popularity — [[cognitive linguistics]] being a prominent example. [10480080] |There are many sub-fields in linguistics, which may or may not be dominated by a particular theoretical approach: [[evolutionary linguistics]], for example, attempts to account for the origins of language; [[historical linguistics]] explores language change; and [[sociolinguistics]] looks at the relation between linguistic variation and social structures. [10480090] |A variety of intellectual disciplines are relevant to the study of language. 
[10480100] |Although certain linguists have downplayed the relevance of some other fields, linguistics, like other sciences, is highly interdisciplinary and draws on work from such fields as [[psychology]], [[informatics]], [[computer science]], [[philosophy]], [[biology]], [[human anatomy]], [[neuroscience]], [[sociology]], [[anthropology]], and [[acoustics]]. [10480110] |==Names for the discipline== [10480120] |Before the twentieth century (the word is first attested 1716), the term "[[philology]]" was commonly used to refer to the science of language, which was then predominantly historical in focus. [10480130] |Since [[Ferdinand de Saussure]]'s insistence on the importance of [[Synchronic analysis (linguistics)|synchronic analysis]], however, this focus has shifted and the term "philology" is now generally used for the "study of a language's grammar, history and literary tradition", especially in the [[USA]], where it was never as popular as elsewhere in the sense "science of language". [10480140] |The term "linguistics" dates from 1847, although "linguist" in the sense "a student of language" dates from 1641. [10480150] |It is now the usual academic term in English for the scientific study of language. [10480160] |==Fundamental concerns and divisions== [10480170] |Linguistics concerns itself with describing and explaining the nature of human language. [10480180] |Relevant to this are the questions of what is universal to language, how language can vary, and how human beings come to know languages. [10480190] |All humans (setting aside extremely pathological cases) achieve competence in whatever language is spoken (or signed, in the case of [[sign language|signed languages]]) around them when growing up, with apparently little need for explicit conscious instruction.
[10480200] |While non-humans acquire their own communication systems, they do not acquire human language in this way (although many non-human animals can learn to respond to language, or can even be trained to use it to a degree). [10480210] |Therefore, linguists assume, the ability to acquire and use language is an innate, biologically-based potential of modern human beings, similar to the ability to walk. [10480220] |There is no consensus, however, as to the extent of this innate potential, or its domain-specificity (the degree to which such innate abilities are specific to language), with some theorists claiming that there is a very large set of highly abstract and specific binary settings coded into the human brain, while others claim that the ability to learn language is a product of general human cognition. [10480230] |It is, however, generally agreed that there are no strong ''genetic'' differences underlying the differences between languages: an individual will acquire whatever language(s) they are exposed to as a child, regardless of parentage or ethnic origin. [10480240] |Linguistic structures are pairings of meaning and form (which may consist of sound patterns, movements of the hand, written symbols, and so on); such pairings are known as [[Ferdinand de Saussure|Saussurean]] [[linguistic sign|signs]]. 
[10480250] |Linguists may specialize in some sub-area of linguistic structure, which can be arranged in the following terms, from form to meaning: [10480260] |* '''[[Phonetics]]''', the study of the physical properties of speech (or signed) production and perception [10480270] |* '''[[Phonology]]''', the study of sounds (adjusted appropriately for signed languages) as discrete, abstract elements in the speaker's mind that distinguish meaning [10480280] |* '''[[Morphology (linguistics)|Morphology]]''', the study of internal structures of [[word]]s and how they can be modified [10480290] |* '''[[Syntax]]''', the study of how words combine to form grammatical [[sentence]]s [10480300] |* '''[[Semantics]]''', the study of the meaning of words ([[lexical semantics]]) and fixed word combinations ([[phraseology]]), and how these combine to form the [[meaning]]s of sentences [10480310] |* '''[[Pragmatics]]''', the study of how [[utterance]]s are used (literally, figuratively, or otherwise) in [[speech acts|communicative acts]] [10480320] |* '''[[Discourse analysis]]''', the analysis of language use in [[texts]] (spoken, written, or signed) [10480330] |Many linguists would agree that these divisions overlap considerably, and the independent significance of each of these areas is not universally acknowledged. [10480340] |Regardless of any particular linguist's position, each area has core concepts that foster significant scholarly inquiry and research. [10480350] |Intersecting with these domains are fields arranged around the kind of external factors that are considered. [10480360] |For example [10480370] |* [[Linguistic typology]], the study of the common properties of diverse unrelated languages, properties that may, given sufficient attestation, be assumed to be innate to human language capacity. [10480380] |* [[Stylistics (linguistics)|Stylistics]], the study of linguistic factors that place a discourse in context. 
[10480390] |* [[Developmental linguistics]], the study of the development of linguistic ability in an individual, particularly [[Language acquisition|the acquisition of language]] in childhood. [10480400] |* [[Historical linguistics]] or Diachronic linguistics, the study of language change. [10480410] |* [[Language geography]], the study of the spatial patterns of languages. [10480420] |* [[Evolutionary linguistics]], the study of the origin and subsequent development of language. [10480430] |* [[Psycholinguistics]], the study of the cognitive processes and representations underlying language use. [10480440] |* [[Sociolinguistics]], the study of social patterns and norms of linguistic variability. [10480450] |* [[Clinical linguistics]], the application of linguistic theory to the area of [[Speech-Language Pathology]]. [10480460] |* [[Neurolinguistics]], the study of the brain networks that underlie grammar and communication. [10480470] |* [[Biolinguistics]], the study of natural as well as human-taught communication systems in animals compared to human language. [10480480] |* [[Computational linguistics]], the study of computational implementations of linguistic structures. [10480490] |* [[Applied linguistics]], the study of language-related issues applied in everyday life, notably language policies, planning, and education. [10480500] |[[Constructed language]] fits under Applied linguistics. [10480510] |The related discipline of [[semiotics]] investigates the relationship between signs and what they signify. [10480520] |From the perspective of semiotics, language can be seen as a sign or symbol, with the world as its representation. [10480530] |==Variation and universality== [10480540] |Much modern linguistic research, particularly within the [[paradigm]] of [[generative grammar]], has concerned itself with trying to account for differences between languages of the world.
[10480550] |This has worked on the assumption that if human linguistic ability is narrowly constrained by human biology, then all languages must share certain fundamental properties. [10480560] |In [[generative grammar|generativist theory]], the collection of fundamental properties all languages share is referred to as [[universal grammar]] (UG). [10480570] |The specific characteristics of this universal grammar are a much debated topic. [10480580] |[[Linguistic typology|Typologists]] and non-generativist linguists usually refer simply to [[linguistic universal|language universals]], or ''universals of language''. [10480590] |Similarities between languages can have a number of different origins. [10480600] |In the simplest case, universal properties may be due to universal aspects of human experience. [10480610] |For example, all humans experience water, and all human languages have a word for water. [10480620] |Other similarities may be due to common descent: the [[Latin language]] spoken by the [[Ancient Rome|Ancient Romans]] developed into Spanish in Spain and Italian in Italy; similarities between Spanish and Italian are thus in many cases due to both being descended from Latin. [10480630] |In other cases, [[Language contact|contact between languages]], particularly where many speakers are bilingual, can lead to much borrowing of structures, as well as words. [10480640] |Similarity may also, of course, be due to coincidence. [10480650] |English ''much'' and Spanish ''mucho'' are not descended from the same form or borrowed from one language to the other; nor is the similarity due to innate linguistic knowledge (see [[False cognate]]). [10480660] |Arguments in favor of language universals have also come from documented cases of [[sign language]]s (such as [[Al-Sayyid Bedouin Sign Language]]) developing in communities of congenitally deaf people, independently of spoken language.
[10480670] |The properties of these sign languages conform generally to many of the properties of spoken languages. [10480680] |Other known and suspected sign language [[language isolate|isolates]] include [[Kata Kolok]], [[Nicaraguan Sign Language]], and [[Providence Island Sign Language]]. [10480690] |== Structures == [10480700] |It has been observed that languages tend to be organized around [[grammatical categories]] such as noun and verb, [[nominative case|nominative]] and [[accusative case|accusative]], or present and past, though, importantly, not exclusively so. [10480710] |The grammar of a language is organized around such fundamental categories, though many languages express the relationships between words and syntax in other discrete ways (cf. some Bantu languages for noun/verb relations, ergative/absolutive systems for case relations, several Native American languages for tense/aspect relations). [10480720] |In addition to making substantial use of discrete categories, language has the important property that it organizes elements into recursive structures; this allows, for example, a noun phrase to contain another noun phrase (as in “the chimpanzee’s lips”) or a clause to contain a clause (as in “I think that it’s raining”). [10480730] |Though recursion in grammar was implicitly recognized much earlier (for example by [[Otto Jespersen|Jespersen]]), the importance of this aspect of language became more widely recognized after the 1957 publication of [[Noam Chomsky]]’s book “[[Syntactic Structures]]”, which presented a formal grammar of a fragment of English. [10480740] |Prior to this, the most detailed descriptions of linguistic systems were of phonological or morphological systems. [10480750] |Chomsky used a [[context-free grammar]] augmented with transformations.
[10480760] |Since then, following the trend of Chomskyan linguistics, context-free grammars have been written for substantial fragments of various languages (for example [[Generalised phrase structure grammar|GPSG]], for English), but it has been demonstrated that human languages include cross-serial dependencies, which cannot be handled adequately by context-free grammars. [10480770] |==Some selected sub-fields == [10480780] |'''Diachronic linguistics''' [10480790] |Studying languages at a particular point in time (usually the present) is "synchronic", while diachronic linguistics examines how language changes through time, sometimes over centuries. [10480800] |It enjoys both a rich history and a strong theoretical foundation for the study of [[language change]]. [10480810] |In universities in the United States, the historical perspective is often out of fashion. [10480820] |The shift in focus to a non-historic perspective started with [[Ferdinand de Saussure|Saussure]] and became predominant with [[Noam Chomsky]]. [10480830] |Explicitly historical perspectives include [[historical-comparative linguistics]] and [[etymology]]. [10480840] |'''Contextual linguistics''' [10480850] |Contextual linguistics may include the study of linguistics in interaction with other academic disciplines. [10480860] |The interdisciplinary areas of linguistics consider how language interacts with the rest of the world. [10480870] |[[Sociolinguistics]], [[anthropological linguistics]], and [[linguistic anthropology]] are seen as areas that bridge the gap between linguistics and society as a whole. [10480880] |[[Psycholinguistics]] and [[neurolinguistics]] relate linguistics to the [[medical science]]s. [10480890] |Other cross-disciplinary areas of linguistics include [[evolutionary linguistics]], [[computational linguistics]] and [[cognitive science]].
[10480900] |'''Applied linguistics''' [10480910] |Linguists are largely concerned with finding and [[descriptive linguistics|describing]] the generalities and varieties both within particular languages and among all languages. [10480920] |[[Applied linguistics]] takes the result of those findings and “applies” them to other areas. [10480930] |Often “applied linguistics” refers to the use of linguistic research in language teaching, but results of linguistic research are used in many other areas, as well. [10480940] |Today in the age of information technology, many areas of applied linguistics attempt to involve the use of computers. [10480950] |[[Speech synthesis]] and [[speech recognition]] use phonetic and phonemic knowledge to provide voice interfaces to computers. [10480960] |Applications of [[computational linguistics]] in [[machine translation]], [[computer-assisted translation]], and [[natural language processing]] are areas of applied linguistics which have come to the forefront. [10480970] |Their influence has had an effect on theories of syntax and semantics, as modeling syntactic and semantic theories on computers constrains those theories. [10480980] |==Description and prescription== [10480990] |''Main articles: [[Descriptive linguistics]], [[Linguistic prescription]]'' [10481000] |Linguistics is '''descriptive'''; linguists describe and explain features of language without making subjective judgments on whether a particular feature is "right" or "wrong". [10481010] |This is analogous to practice in other sciences: a [[zoologist]] studies the animal kingdom without making subjective judgments on whether a particular animal is better or worse than another. [10481020] |'''Prescription''', on the other hand, is an attempt to promote particular linguistic usages over others, often favouring a particular dialect or "[[acrolect]]".
[10481030] |This may have the aim of establishing a [[Standard language|linguistic standard]], which can aid communication over large geographical areas. [10481040] |It may also, however, be an attempt by speakers of one language or dialect to exert influence over speakers of other languages or dialects (see [[Linguistic imperialism]]). [10481050] |An extreme version of prescriptivism can be found among [[censorship|censors]], who attempt to eradicate words and structures which they consider to be destructive to society. [10481060] |== Speech and writing == [10481070] |Most contemporary linguists work under the assumption that [[spoken language|spoken]] (or signed) language is more fundamental than [[written language]]. [10481080] |This is because: [10481090] |* Speech appears to be a human "universal", whereas there have been many [[culture]]s and speech communities that lack written communication; [10481100] |* Speech evolved before human beings discovered writing; [10481110] |* People learn to speak and process spoken languages more easily and much earlier than writing. [10481120] |Linguists nonetheless agree that the study of written language can be worthwhile and valuable. [10481130] |For research that relies on [[corpus linguistics]] and [[computational linguistics]], written language is often much more convenient for processing large amounts of linguistic data. [10481140] |Large corpora of spoken language are difficult to create and hard to find, and are typically [[transcription (linguistics)|transcribed]] and written. [10481150] |Additionally, linguists have turned to text-based discourse occurring in various formats of [[computer-mediated communication]] as a viable site for linguistic inquiry. [10481160] |The study of [[writing systems]] themselves is in any case considered a branch of linguistics. [10481170] |== History == [10481180] |Some of the earliest linguistic activities can be traced to [[Iron Age India]] with the analysis of [[Sanskrit]].
[10481190] |The [[Pratishakhya]]s (from ca. the 8th century BC) constitute, as it were, a proto-linguistic ''ad hoc'' collection of observations about mutations to a given [[corpus linguistics|corpus]] particular to a given [[Shakha|Vedic school]]. [10481200] |Systematic study of these texts gives rise to the [[Vedanga]] discipline of [[Vyakarana]], the earliest surviving account of which is the work of {{IAST|[[Pānini]]}} (c. 520 – 460 BC), who, however, looks back on what are probably several generations of grammarians, whose opinions he occasionally refers to. [10481210] |{{IAST|Pānini}} formulates close to 4,000 rules which together form a compact [[generative grammar]] of Sanskrit. [10481220] |Inherent in his analytic approach are the concepts of the [[phoneme]], the [[morpheme]] and the [[root]]. [10481230] |Due to its focus on brevity, his grammar has a highly unintuitive structure, reminiscent of contemporary "machine language" (as opposed to "human readable" programming languages). [10481240] |Indian linguistics maintained a high level for several centuries; [[Mahābhāṣya|Patanjali]] in the 2nd century BC still actively criticizes Panini. [10481250] |In the later centuries BC, however, Panini's grammar came to be seen as prescriptive, and commentators came to be fully dependent on it. [10481260] |[[Bhartrihari]] (c. 450 – 510) theorized the act of speech as being made up of four stages: first, conceptualization of an idea; second, its verbalization and sequencing (articulation); third, delivery of speech into atmospheric air; and fourth, the interpretation of speech by the listener, the interpreter. [10481270] |In the [[Middle East]], the [[Persian language|Persian]] linguist [[Sibawayh]] made a detailed and professional description of [[Arabic language|Arabic]] in 760, in his monumental work, ''Al-kitab fi al-nahw'' (الكتاب في النحو, ''The Book on Grammar''), bringing many linguistic aspects of language to light. 
[10481280] |In his book he distinguished [[phonetics]] from [[phonology]]. [10481290] |Western linguistics begins in Classical Antiquity with grammatical speculation such as [[Plato]]'s ''[[Cratylus]]''. [10481300] |[[William Jones (philologist)|Sir William Jones]] noted that [[Sanskrit]] shared many common features with classical [[Latin]] and [[Ancient Greek|Greek]], notably verb roots and grammatical structures, such as the [[case system]]. [10481310] |This led to the theory that all languages sprang from a common source and to the discovery of the [[Indo-European]] [[language family]]. [10481320] |He began the study of [[comparative linguistics]], which would uncover more language families and branches. [10481330] |Some 19th-century linguists were [[Jakob Grimm]], who devised a principle of consonantal shifts in pronunciation – known as [[Grimm's Law]] – in 1822; [[Karl Verner]], who formulated [[Verner's Law]]; [[August Schleicher]], who created the "Stammbaumtheorie" ("family tree"); and [[Johannes Schmidt (linguist)|Johannes Schmidt]], who developed the "Wellentheorie" ("wave model") in 1872. [10481340] |[[Ferdinand de Saussure]] was the founder of modern structural linguistics. [10481350] |[[Edward Sapir]], a leader in American structural linguistics, was one of the first to explore the relations between language studies and anthropology. [10481360] |His methodology had a strong influence on all his successors. [10481370] |[[Noam Chomsky|Noam Chomsky's]] formal model of language, [[transformational-generative grammar]], developed under the influence of his teacher [[Zellig Harris]], who was in turn strongly influenced by [[Leonard Bloomfield]], has been the dominant model since the 1960s. [10481380] |[[Noam Chomsky]] remains a pop-linguistic figure. 
[10481390] |Linguists working in frameworks such as [[Head-Driven Phrase Structure Grammar]] (HPSG) or [[Lexical Functional Grammar]] (LFG) increasingly stress the importance of formalization and formal rigor in linguistic description, and may distance themselves somewhat from Chomsky's more recent work (the "Minimalist" program for [[Transformational grammar]]), connecting more closely to his earlier works. [10481400] |Other linguists working in [[Optimality Theory]] state generalizations in terms of violable constraints that interact with each other, and abandon the traditional rule-based formalism pioneered by early work in generativist linguistics. [10481410] |Functionalist linguists working in [[functional grammar]] and [[Cognitive Linguistics]] tend to stress the non-autonomy of linguistic knowledge and the non-universality of linguistic structures, thus differing significantly from the Chomskyan school. [10481420] |They reject Chomskyan intuitive introspection as a scientific method, relying instead on typological evidence.