Philip Resnik
News flash (April 26, 2012): Very happy to report that a new project in computational political science has received funding from the National Science Foundation. This collaboration with computer scientist Noah Smith and political scientists Amber Boydstun and Justin Gross will develop new computational modeling methods, grounded in data-driven computational linguistics, aimed at improving the scientific understanding of how issues are framed by political elites, the media, and the public.
News flash (April 23-24, 2012): I participated in a fascinating workshop at the National Institutes of Health on Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making. I'm posting my slides here and there should be a full videocast of the workshop available at NIH soon.
News flash (MEDIA, April 17, 2012): I've just discovered that an article in HealthLeaders Media entitled Are EMRs Killing the Clinical Narrative? covered my SXSWi talk (slides, audio) and it appears to have stirred up some interesting discussion. Which is great, since that's exactly what it was designed for.
News flash (April 10, 2012): CodeRyte, a company I helped start up and still advise, has been acquired by 3M. CodeRyte is a leading provider of natural language processing solutions in healthcare.
News flash (March 27, 2012): I've been invited to attend TEDMED 2012 in DC April 10-13. From the list of delegates and speakers this looks like it should be fascinating. [Update: It was indeed fascinating, wacky, and intellectually stimulating.]
News flash (March 24, 2012): I gave an invited plenary lecture at the 2012 American Association for Applied Linguistics conference, entitled The Linguistics of Spin: A Computational Linguist's Forays into Social Science. During the talk I used myself as a guinea pig for the React Labs live polling app that I'm developing; results can be found here.
News flash (March 19, 2012): Slides for my talk at South By Southwest Interactive (SXSWi) on Language Technology, Electronic Health Records, and the Clinical Narrative are now available here, with audio here.
News flash (MEDIA, February 11, 2012): I was really pleased to be included among those quoted in discussions by the Wall Street Journal's "Numbers Guy", Carl Bialik, about mining Twitter for public opinion, including both the print column and the accompanying blog post.
News flash (MEDIA, January 31, 2012): I had great fun guesting on the Kojo Nnamdi show on WAMU 88.5 in Washington DC, talking about New Frontiers in Political Polling: Social Media and "Sentiment Analysis". We discussed computational analysis of social media in the context of political campaigns, which was also the topic of a recent posting I did on Language Log called #CompuPolitics; we also briefly discussed the React Labs project, in which collaborators and I are developing a smartphone app for large-scale, real-time collection of people's responses during live events like political debates.
Machine translation. My recent work has largely focused on exploiting parallel corpora and linguistically informed modeling in statistical machine translation and in multilingual natural language processing more generally (with a focus on Chinese and Arabic, as well as other less-studied languages). As part of this effort, my postdoc David Chiang (now at USC/ISI) developed Hiero, the first syntax-based system to demonstrate performance comparable to the then state-of-the-art statistical phrase-based MT systems (see 2005 NIST MT Evaluation results). I have worked with a number of students to further improve hierarchical phrase-based translation. Our innovations include the introduction of lattice decoding (useful in translation of speech recognition output and also for text translation of morphologically complex languages), development of efficient algorithms for using suffix array representations in hierarchical decoding, use of English-to-English translation to create artificial reference translations for use in parameter tuning, the introduction of soft syntactic constraints based on source language structure, and exploitation of lattices (and soon, forests) to represent source language paraphrase and syntactically driven reordering alternatives.
Crowdsourcing and translation. Connected with my machine translation research, Ben Bederson and I are working on an ambitious attempt to achieve low-cost, high-quality translation by taking advantage of monolingual human participants in a computer-assisted translation protocol, in a project we call "Translation as a Collaborative Process". We're blending ideas from machine translation, human-computer interfaces, and distributed human computation ("crowdsourcing"), and tackling the real-world problem of translating books in the International Children's Digital Library. We received a 2009 Google Research Award sponsoring this work, as well as funding from NSF. In September 2009, Ben gave a Google tech talk about the project, which is available on YouTube. Ben and I now have a follow-up Google Research Award in which we're collaborating with Chris Callison-Burch to bring his crowdsourcing work and ours together in a framework we're calling "Translate the World".
Computational social science. I have also been doing work on sentiment analysis and related topics such as persuasion, framing, and "spin", with a particular interest in the connections among lexical semantics, surface linguistic expression, and underlying internal state. For example, why does my son say "My toy broke" instead of "I broke my toy"? He's using syntax to package up the statement about what happened in a way that de-emphasizes semantic properties such as causation, volition, and change-of-state. (This is an example of using syntax for "spin", just the same way that Ronald Reagan did in 1987 when he sidestepped attributing responsibility for the Iran-contra scandal; remember "Mistakes were made"? Precocious child.) My student Stephan Greene did a fascinating dissertation on this topic, and for a conference-paper-length description see our 2009 NAACL paper. One area in which I'm excited about starting to apply these ideas is computational political science. Current topics of investigation include modeling syntax/semantics/sentiment connections in a Bayesian framework, bootstrapping multilingual sentiment analysis capabilities, and working with political scientists to model agenda setting and framing in political discourse. I've also been working with political scientist collaborators on the React Labs project, a smartphone app for large-scale, real-time collection of people's responses during live events like political debates. Outside academia, I do real-world sentiment analysis as Lead Scientist with Converseon Inc., a leading social media firm.
Clinical informatics. Since about 1999 I've been involved in natural language processing for clinical documentation. I helped start up CodeRyte, Inc., which is now the nation's fastest growing provider of NLP solutions in healthcare (see, e.g., Deloitte's Technology Fast 500 and the Inc. 5000 listings). I developed pieces of the core technology, helped build an excellent language technology team, and I currently advise the company on technology development and strategic direction. I've also presented a tutorial on NLP and computer assisted coding at the convention of the American Health Information Management Association (AHIMA), and I have served as one of the co-chairs of AHIMA's steering committee on computer assisted coding. Much to my surprise, I was listed at #82 on the Future Health 100, a list of "the most creative and influential innovators working in healthcare today" at healthspottr.com.
Computational psycholinguistics. During the next several years, I hope to re-engage more fully with my interests in computational psycholinguistics. I'm particularly interested in the possibility that ideas from (statistical) information theory may have a useful role to play in explaining why language works the way it does. (This is an idea I first began exploring in my dissertation [ps, pdf], back in 1993, and in recent years a variety of people like John Hale, Roger Levy, and Florian Jaeger, among others, have done very interesting work in the same spirit.) I'm also interested in using Bayesian modeling as a way to bring linguists here with cognitive modeling interests together with computational linguists focusing on applications. Momentum for that around here has already started building with the recent arrival of Naomi Feldman in our Linguistics Department.
Empirical linguistics. I'm quite interested in promoting the use of naturally occurring data as evidence in linguistics research. I led the development of the Linguist's Search Engine (LSE), a tool designed to make it easier for linguists to search naturally occurring data using syntactic and lexical criteria. The aim was to help more linguists go beyond the exclusive use of introspective judgments as empirical evidence, which can lead to useful and interesting results. In follow-on work with the Center for the Advanced Study of Language (CASL), we ported the LSE to Chinese, and the LSE code is available under an open source license. (Aaron Elkiss was the LSE's chief architect, implementor, and guru. I kept it running for a number of years after he graduated, but eventually retired it. Anyone interested in resurrecting it: the source code is available.)
See my on-line list of publications for links to papers on the above research topics and more.
Professional History
Philip Resnik, Professor
Department of Linguistics and Institute for Advanced Computer Studies
1401 Marie Mount Hall
University of Maryland
College Park, MD 20742 USA
UMIACS phone: (301) 405-6760
Linguistics phone: (301) 405-8903
Fax: (301) 405-7104
http://umiacs.umd.edu/~resnik
E-mail: resnik [AT] umd _DOT_ edu
UMIACS office: AV Williams 3143
By far the best way to reach me is by e-mail to resnik [AT] umd _DOT_ edu.