Sapir Whorf – Page 3 – Gravity Drift

In the Future, We Will Photograph Everything and Look at Nothing – The New Yorker

My guess is that it wants to kill the software, but it doesn’t want the P.R. nightmare that would follow. Remember the outcry over its decision to shut down its tool for R.S.S. feeds, Google Reader? Nik loyalists are even more rabid.

…

“The definition of photography is changing, too, and becoming more of a language,” the Brooklyn-based artist and professional photographer Joshua Allen Harris told me. “We’re attaching imagery to tweets or text messages, almost like a period at the end of a sentence. It’s enhancing our communication in a whole new way.”

…

In other words, “the term ‘photographer’ is changing,” he said. As a result, photos are less markers of memories than they are Web-browser bookmarks for our lives.

Learning machine learning — Benedict Evans

As has happened with many technologies before, AI is bursting out of universities and research labs and turning into product, often led by those researchers as they turn entrepreneur and create companies. Lots of things started working, the two most obvious illustrations being the progress for ImageNet and of course AlphaGo. And in parallel, many of these capabilities are being abstracted – they’re being turned into open source frameworks that people can pick up (almost) off the shelf. So, one could argue that AI is undergoing a take-off in practicality and scale that’s going to transform tech just as, in different ways, packets, mobile, or open source did.

This also means, though, that there’s a sort of tech Tourettes’ around – people shout ‘AI!’ or ‘MACHINE LEARNING!’ where people once shouted ‘OPEN!’ or ‘PACKETS!’. This stuff is changing the world, yes, but we need context and understanding. ‘AI’, really, is lots of different things, at lots of different stages. Have you built HAL 9000 or have you written a thousand IF statements?

Back in 2000 and 2001 (and ever since) I spent a lot of my time reading PDFs about mobile – specifications and engineers’ conference presentations and technical papers – around all the layers of UMTS, WCDMA, J2ME, MEXE, WML, iAppli, cHTML, FeliCa, ISDB-T and many other things besides, some of which ended up mattering and some of which didn’t. (My long-dormant del.icio.us account has plenty of examples of both).

The same process will happen now with AI within a lot of the tech industry, and indeed all the broader industries that are affected by it. AI brings a blizzard of highly specialist terms and ideas, layered upon each other, that previously only really mattered to people in the field (mostly, in universities and research labs) and people who took a personal interest, and now, suddenly, this starts affecting everyone in technology. So, everyone who hasn’t been following AI for the last decade has to catch up.

IBM 704 – Wikipedia, the free encyclopedia

The programming languages FORTRAN^[5] and LISP^[6] were first developed for the 704.

MUSIC, the first computer music program, was developed on the IBM 704 by Max Mathews.

In 1962 physicist John Larry Kelly, Jr created one of the most famous moments in the history of Bell Labs by using an IBM 704 computer to synthesize speech. Kelly’s voice recorder synthesizer vocoder recreated the song Daisy Bell, with musical accompaniment from Max Mathews. Arthur C. Clarke was coincidentally visiting friend and colleague John Pierce at the Bell Labs Murray Hill facility at the time of this speech synthesis demonstration, and Clarke was so impressed that six years later he used it in the climactic scene of his novel and screenplay for 2001: A Space Odyssey,^[7] where the HAL 9000 computer sings the same song.^[8]

Edward O. Thorp, a math instructor at MIT, used the IBM 704 as a research tool to investigate the probabilities of winning while developing his blackjack gaming theory.^[9]^[10] He used FORTRAN to formulate the equations of his research model.

The IBM 704 was used as the official tracker for the Smithsonian Astrophysical Observatory Operation Moonwatch in the fall of 1957. See The M.I.T. Computation Center and Operation Moonwatch. IBM provided four staff scientists to aid Smithsonian Astrophysical Observatory scientists and mathematicians in the calculation of satellite orbits: Dr. Giampiero Rossoni, Dr. John Greenstadt, Thomas Apple and Richard Hatch.

Intro To Computational Linguistics

Machine Translation

At the end of the 1950s, researchers in the United States, Russia, and Western Europe were confident that high-quality machine translation (MT) of scientific and technical documents would be possible within a very few years. After the promise had remained unrealized for a decade, the National Academy of Sciences of the United States published the much cited but little read report of its Automatic Language Processing Advisory Committee. The ALPAC Report recommended that the resources that were being expended on MT as a solution to immediate practical problems should be redirected towards more fundamental questions of language processing that would have to be answered before any translation machine could be built. The number of laboratories working in the field was sharply reduced all over the world, and few of them were able to obtain funding for more long-range research programs in what then came to be known as computational linguistics.

There was a resurgence of interest in machine translation in the 1980s and, although the approaches adopted differed little from those of the 1960s, many of the efforts, notably in Japan, were rapidly deemed successful. This seems to have had less to do with advances in linguistics and software technology or with the greater size and speed of computers than with a better appreciation of special situations where ingenuity might make a limited success of rudimentary MT. The most conspicuous example was the METEO system, developed at the University of Montreal, which has long provided the French translations of the weather reports used by airlines, shipping companies, and others. Some manufacturers of machinery have found it possible to translate maintenance manuals used within their organizations (not by their customers) largely automatically by having the technical writers use only certain words and only in carefully prescribed ways.

Why Machine Translation Is Hard

Many factors contribute to the difficulty of machine translation, including words with multiple meanings, sentences with multiple grammatical structures, uncertainty about what a pronoun refers to, and other problems of grammar. But two common misunderstandings make translation seem altogether simpler than it is. First, translation is not primarily a linguistic operation, and second, translation is not an operation that preserves meaning.

There is a famous example that makes the first point well. Consider the sentence:

The police refused the students a permit because they feared violence.

Suppose that it is to be translated into a language like French in which the word for ‘police’ is feminine. Presumably the pronoun that translates ‘they’ will also have to be feminine. Now replace the word ‘feared’ with ‘advocated’. Now, suddenly, it seems that ‘they’ refers to the students and not to the police and, if the word for students is masculine, it will therefore require a different translation. The knowledge required to reach these conclusions has nothing linguistic about it. It has to do with everyday facts about students, police, violence, and the kinds of relationships we have seen these things enter into.
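The dependence on non-linguistic knowledge can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the `PLAUSIBLE` table stands in for "everyday facts about students, police, violence", and the French pronoun forms assume singular feminine "la police" and plural masculine "les étudiants". It is not how any particular MT system resolves pronouns.

```python
# Toy pronoun resolution for "The police refused the students a permit
# because they <verb> violence."
# PLAUSIBLE is a hand-written stand-in for world knowledge (hypothetical).
PLAUSIBLE = {
    ("police", "feared"),
    ("students", "advocated"),
}

# Assumed French pronouns: "la police" -> elle, "les étudiants" -> ils.
FRENCH_PRONOUN = {"police": "elle", "students": "ils"}


def resolve_they(verb, candidates=("police", "students")):
    """Pick the candidate noun that world knowledge says can do `verb`."""
    for noun in candidates:
        if (noun, verb) in PLAUSIBLE:
            return noun
    return None  # no fact available: the grammar alone cannot decide


def french_pronoun(verb):
    """Return the French pronoun for 'they', or None if unresolved."""
    return FRENCH_PRONOUN.get(resolve_they(verb))


print(french_pronoun("feared"))
print(french_pronoun("advocated"))
```

Nothing in the sentence's grammar changes between the two calls; only the lookup in the world-knowledge table does, which is the point the example makes.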

The second point is, of course, closely related. Consider the following question, stated in French: Où voulez-vous que je me mette? It means literally, “Where do you want me to put myself?” but it is a very natural translation for a whole family of English questions of the form “Where do you want me to sit/stand/sign my name/park/tie up my boat?” In most situations, the English “Where do you want me?” would be acceptable, but it is natural and routine to add or delete information in order to produce a fluent translation. Sometimes it cannot be avoided because there are languages like French in which pronouns must show number and gender, Japanese where pronouns are often omitted altogether, Russian where there are no articles, Chinese where nouns do not differentiate singular and plural nor verbs present and past, and German where flexibility of the word order can leave uncertainties about what is the subject and what is the object.

The Structure of Machine Translation Systems

While there have been many variants, most MT systems, and certainly those that have found practical application, have parts that can be named for the chapters in a linguistic text book. They have lexical, morphological, syntactic, and possibly semantic components, one for each of the two languages, for treating basic words, complex words, sentences and meanings. Each feeds into the next until a very abstract representation of the sentence is produced by the last one in the chain.

There is also a ‘transfer’ component, the only one that is specialized for a particular pair of languages, which converts the most abstract source representation that can be achieved into a corresponding abstract target representation. The target sentence is produced from this essentially by reversing the analysis process. Some systems make use of a so-called ‘interlingua’ or intermediate language, in which case the transfer stage is divided into two steps, one translating a source sentence into the interlingua and the other translating the result of this into an abstract representation in the target language.
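As a rough sketch of the analysis, transfer and generation chain described above, the toy below translates one three-word English phrase into French. The lexicons, the part-of-speech labels and the single reordering rule are all invented for illustration; a real system would have full morphological, syntactic and semantic components on each side.

```python
# Toy transfer-based MT: analysis -> transfer -> generation.
# All tables and rules here are hypothetical, chosen to fit one example.

# Source-side lexicon (English): word -> (lemma, part of speech)
EN_LEXICON = {
    "the": ("the", "DET"),
    "red": ("red", "ADJ"),
    "car": ("car", "NOUN"),
}

# Transfer lexicon: the only table tied to the English-French pair.
EN_FR_TRANSFER = {"the": "la", "red": "rouge", "car": "voiture"}


def analyze(sentence):
    """Analysis: map each source word to an abstract (lemma, POS) pair."""
    return [EN_LEXICON[word] for word in sentence.lower().split()]


def transfer(source_repr):
    """Transfer: convert the source representation to a target one."""
    target = [(EN_FR_TRANSFER[lemma], pos) for lemma, pos in source_repr]
    reordered = []
    i = 0
    while i < len(target):
        # Structural rule: swap ADJ NOUN into NOUN ADJ for the target.
        if (i + 1 < len(target)
                and target[i][1] == "ADJ"
                and target[i + 1][1] == "NOUN"):
            reordered.extend([target[i + 1], target[i]])
            i += 2
        else:
            reordered.append(target[i])
            i += 1
    return reordered


def generate(target_repr):
    """Generation: turn the target representation back into a string."""
    return " ".join(word for word, _pos in target_repr)


def translate(sentence):
    return generate(transfer(analyze(sentence)))


print(translate("The red car"))  # la voiture rouge
```

Only `transfer` and its lexicon know about both languages; `analyze` and `generate` each deal with one language, which is what would let them be reused with a different language pair.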

One other problem for computers is dealing with metaphor. Metaphors are a common part of language and occur frequently in the computer world:

How can I kill the program?
How do I get back into dos?
My car drinks gasoline

One approach treats metaphor as a failure of regular semantic rules

Compute the normal meaning of get into—dos violates its selection restrictions

dos isn’t an enclosure so the interpreter fails

Next, the interpreter has to search for an unconventional meaning for get into and recompute its meaning

If an unconventional meaning isn’t available, you can try using context, or world knowledge
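The steps above can be sketched as a lookup that tries the conventional sense first and falls back to an unconventional one when a selection restriction is violated. The feature sets and verb senses below are hypothetical and cover only the "get into" example.

```python
# Toy metaphor handling via selection restrictions.
# NOUN_FEATURES and VERB_SENSES are invented for this sketch.
NOUN_FEATURES = {
    "dos": {"program"},
    "house": {"enclosure"},
}

# Each sense: (feature required of the object, gloss, is_conventional)
VERB_SENSES = {
    "get into": [
        ("enclosure", "physically enter", True),
        ("program", "start or return to", False),
    ],
}


def interpret(verb, obj):
    """Return (label, gloss) for the first sense whose restriction holds.

    Conventional senses are tried first; unconventional ones are tried
    only after every conventional sense has failed.
    """
    features = NOUN_FEATURES.get(obj, set())
    for want_conventional in (True, False):
        for required, gloss, conventional in VERB_SENSES[verb]:
            if conventional == want_conventional and required in features:
                label = "literal" if conventional else "metaphorical"
                return label, gloss
    # No listed sense fits: a fuller system would now try context
    # or world knowledge.
    return None


print(interpret("get into", "house"))
print(interpret("get into", "dos"))
print(interpret("get into", "cold"))
```

The final call returns None because "cold" has no listed features, which corresponds to the last step above: falling back on context or world knowledge.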

Statistical procedures aren’t likely to generate interpretations for new metaphors.

Interpretation routines might result in overgeneralizations:

How can I kill dos? —> *How can I give birth to dos?

*How can I slay dos?

Mary caught a cold from John —> *John threw Mary his cold.

Catching a cold is unintentional (as opposed to catching a thief)

Getting Started

The best way to learn about language processing is to write your own computer programs. To do this, users will need access to a computer that can display information on the internet. Anyone with an email account on a personal computer has this type of access. The exercises in this class are written for the Perl programming language. This language is widely available on mainframe computers, and allows users to manipulate strings of text with a modicum of ease. In order to use Perl on a mainframe computer, however, the reader will have to access the computer directly via a terminal emulation program.

The only other item that you will need for Perl programming is a text editor. Text editors provide a means of writing the commands that make up a Perl program. Mainframe computers typically have a program that allows users to write text files. You can also use these programs to write a Perl program. The University of Kansas mainframe uses the Pico and vi editors. Once you have assembled the basic tools for creating Perl programs you are ready to begin language processing.

Intro To Computational Linguistics

The image of humans conversing with their computers is both a thoroughly accepted cliche of science fiction and the ultimate goal of computer programming, and yet, the year 2001 has come and gone without the appearance of anything like the HAL 9000 talking computer featured in the movie 2001: A Space Odyssey.

Computational linguists attempt to use computers to process human languages. The field of computational linguistics has two general aims:

The technological. To enable computers to analyze and process natural language.
The psychological. To model human language processing on computers.

From the technological perspective, natural language applications include:

Speech recognition. Today, many personal computers include speech recognition software.
Natural language interfaces to software. For example, demonstration systems have been built that let a user ask for flight information.

Examples:

chatterbots, e.g., Alice

natural language understanding, e.g., a perl parser

Document retrieval and information extraction from written text. For example, a computer system could scan newspaper articles, looking for information about events of a particular type and enter the information into a database.

Examples:

web searches, e.g., google.

course information and enrollment, e.g., KU, Linguistics.

Machine translation. Computers offer the promise of quick translations between languages.

Examples:

machine translation, e.g., SDL International

The rapid growth of the Internet/WWW and the emergence of the information society poses exciting new challenges to computational linguistics. Although the new media combine text, graphics, sound and movies, the whole wealth of multimedia information can only be structured, indexed and navigated through language. For browsing, navigating, filtering and processing the information on the web, we need language technology. The increasing multilingual nature of the web constitutes an additional challenge for language technology. The multilingual web can only be mastered with the help of multilingual tools for indexing and navigating.

Computational linguists adopting the psychological perspective hypothesize that at some abstract level, the brain is a kind of biological computer, and that an adequate answer to how people understand and generate language must be in terms formal and precise enough to be modeled by a computer.

Linguists Not Exactly Wow About Facebook’s New Reactions | WIRED

When my 4-month-old son is angry he turns bright red. When he finds something funny, he makes an alarming gurgling sound. When something surprises him, he says “Ah!”

You know: Like Facebook.

The introduction of Reactions, a set of five new “graphicons” with assigned textual meanings, probably isn’t supposed to be infantilizing. The social network just wants people to do more than “Like” someone else’s post. The new kids: Love, Sad, Angry, Wow, and Haha.

What do those words have in common? Not a lot, actually. To a grammar purist, that’s annoying. “These words are in radically different categories,” says Geoff Pullum, a linguist at the University of Edinburgh and contributor to the blog Language Log. “It looks like syntax is being thrown out the window here and being replaced by grunts like animals would make.”

Syntax, as you might remember, is the organization of words into sentences. By way of counter-example, syntactic conventions are what Internet meme languages like Dogespeak or Lolcats abuse. When you are sad because Monday, you are contravening the syntax of standard English. Much disappoint.

The Reaction words, though, have different syntactic uses. “Love” is either a noun or verb, depending on how you read it; “Sad” and “Angry” are adjectives; and “Wow” is an interjection, expressing astonishment. Pullum considers “Haha” to also be an interjection, expressing amusement, but Susan Herring, a linguist at Indiana University who studies language on social media, sees it as a non-speech sound.

Pullum and Herring agree, though, that the syntax of the new Facebook Reactions makes no sense. When Facebook asks you to respond to a status with that set of six words, it’s actually asking your brain to do something that’s slightly complicated: to fill in an implied sentence, or to “predicate” it. Programmatic linguists call this “inferencing.” The problem is, because these words are not the same category of speech, they require different predicates.

If you click “Love,” your brain must autocomplete the implied phrase “I love this.” Fine; just like “Like.” So far so good. But things get weirder with the adjectives. If you choose “Sad” or “Angry,” it’s not “I sad this” or “I angry this.” It’s “This makes me angry,” or “This makes me sad.” Makes sense! But the mental gymnastics of tweaking this supplied context aren’t easy.

For “Wow” and “Haha,” the problem is different. Both actually stand on their own outside of a sentence, so your brain doesn’t need to infer any predicate at all. Which is nice! But also inconsistent!

If those inconsistencies bother you, you may in fact have a disorder called “grammar purism.” Sufferers of GP have been known to correct mistakes on dinner menus and chew their cheeks in an effort not to correct their friend who always says “I have drank way too much tonight!” GP has no cure, but some sufferers find poetry or Winston Churchill quotes soothing.

“It’s a little bit perturbing that they are not the same parts of speech,” Herring says. But she doesn’t just talk about talking; she does something about it. As a thought experiment, Herring tried to rationalize the Reactions.

First she tried to make them all verbs. It didn’t work. You can say, “I love,” or “I laugh,” but as soon as you get to “I anger,” you’re doomed, because in that construction anger takes an object—“I anger the cat (by never letting it catch the laser pointer).”

Next Herring tried adjectives, where the predicate is “I am.” It was just as bad. “I’m sad” and “I’m angry,” are good, but for Love you’d need to say “I’m pleased” or “I’m delighted,” and that’s not the same emotion, really, at all. Not to mention how stilted “I’m amused” or “I’m surprised” would be for Wow and Haha. Nouns work better, and are reminiscent of that Internet tradition of spelling out the actions that emoji or emoticons are describing. Love could stay the same, but Sad would become Frown, Angry would become Scowl, Haha would become Laugh, Wow would become, perhaps, Gasp.

This gets closer to what Pullum says is the true nature of Facebook Reactions. “The happy face is like a squeal of delight; the sad face is like a sort of ‘humph’ of displeasure; the ‘wow’ face is like a widening of the eyes and opening of the mouth; the ‘haha’ is like giggling,” he says. “The emoji are all really just the equivalent of noises or gestures for directly expressing internal states. What is not being called upon here is the grammar and meaning that differentiate us humans from the other animals.”

None of this would matter to GP sufferers if Facebook hadn’t assigned each reaction a textual meaning. Unlike regular emoji and emoticons, which are purely graphical, Facebook chose to label each Reaction with a word, eliminating the ambiguity that makes emoji so great. This way, you don’t wonder if, say, the face with the open mouth is expressing fear or shock. “Once they decide to provide text, they back themselves into a corner, syntactically,” Herring says.

Gnolia – Wikipedia, the free encyclopedia

Gnolia, named Ma.gnolia until 2009, was a social bookmarking web site with an emphasis on design, social features, and open standards. It is now perhaps most notable for losing members’ bookmarks in a widely reported^[2]^[3]^[4]^[5]^[6] data loss incident in January 2009. It relaunched as a smaller service several months later and was ultimately shut down at the end of 2010.

Users could rate bookmarks and mark bookmarks as private. Unlike its main competitor^[7] Delicious, Ma.gnolia stored snapshots of bookmarked web pages. One feature that distinguished it from other similar web sites was the group feature, which allowed several users to share a common collection of bookmarks, managed by a selected number of group managers.

The design of the web site allowed for integration of the service into other applications via both a REST API and an API similar to the Delicious API.