Frequently Asked Questions

I want to work with you, and I'm currently a student at UMD

First, take computational linguistics, machine learning, or digging into data. Once you've done that, drop me an e-mail, and we'll figure out a project to work on together. After we finish the project, we can discuss longer-term arrangements.

I want to work with you, and I'm not currently a student at UMD

Then you should apply to be a student at the University of Maryland. The best way would be to apply to the College of Information Studies and mention me specifically in your application. I do also work with computer science students; if that is more your specialty, you should apply to computer science and mention the CLIP lab (and me) in your application. After you submit an application, please drop me an e-mail with your CV letting me know you applied. I may not reply, but it's still very useful!

What are your expectations / preferences in terms of what a student should know?

I personally like C++ and Python, but the culture here leans to Java, which I've been using more and more (and likely will continue to). I prefer writing tests to debugging, but debugging is a necessary evil. I do like reinventing the wheel somewhat to keep things self-contained and consistent, but I contribute the result to things like NLTK so that other people don't have to do the same. I also like using style checkers and the like to keep myself organized. (Though I say this, you can get a more honest picture of my coding style by looking at what I've actually written.)

Anyhow, you probably can already code in some language pretty well, and conforming to my coding style will increase the probability that I'll be more hands-on in helping you code and debug, but if you want to program in LISP or Prolog, that's perfectly fine too, as long as it works for you.

Being comfortable with probability is probably the more important requirement. You'll likely have to deal with messy probability distributions, take expectations, derive conditional distributions given a joint distribution, implement dynamic programming to sample from PCFGs, do Taylor approximations, do some optimizations, etc. This shouldn't be taken as a laundry list of things you should know (it's great if you do) but just as a heads up of the kinds of things you might run into; part of a graduate education (life, for that matter) is learning new stuff. There will be many opportunities to learn: from classes (at Maryland: NLP 1 & 2, Machine Learning, and Cloud Computing), your peers, and reading groups.
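To give a flavor of what "derive conditional distributions given a joint distribution" and "take expectations" look like in code, here is a minimal Python sketch. The toy joint table, its numbers, and the function names are all made up for illustration; real problems will be messier.

```python
from collections import defaultdict

# Hypothetical toy joint distribution p(weather, activity); the numbers are invented.
JOINT = {
    ("sun", "walk"): 0.4,
    ("sun", "read"): 0.1,
    ("rain", "walk"): 0.1,
    ("rain", "read"): 0.4,
}


def marginal_x(joint):
    """Sum out y to get the marginal p(x)."""
    p_x = defaultdict(float)
    for (x, _y), p in joint.items():
        p_x[x] += p
    return dict(p_x)


def conditional_y_given_x(joint, x):
    """Apply p(y | x) = p(x, y) / p(x) for one fixed value of x."""
    p_x = marginal_x(joint)[x]
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}


def expectation(dist, f):
    """E[f(Y)] under a discrete distribution given as {outcome: probability}."""
    return sum(p * f(y) for y, p in dist.items())


if __name__ == "__main__":
    cond = conditional_y_given_x(JOINT, "sun")
    print(cond)
    # Expected value of the indicator "activity is a walk", given sun.
    print(expectation(cond, lambda y: 1.0 if y == "walk" else 0.0))
```

If you can read this and see why the conditional is just the joint renormalized by the marginal, you're in good shape for the harder versions.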

I think attending (and contributing to) a reading group or two is critical for learning about a field and being a good scholar; it's fun and not a chore at all, but I want to be up front in saying that any student of mine should be an active participant in a reading group or two.

How do you interact with students?

Likely some sort of a group meeting every week with students, and a standing weekly one-on-one meeting that can be cancelled if it isn't needed. I also like low-key weekly reports. But I would like to be kept in the loop, with students dropping by and e-mailing when they have questions.

How should you put LaTeX documents in a version control system?

If you're actively editing a file, you should only put in the files needed to generate the output. Do not store the intermediate or final output (pdf, log, aux, etc.). This is not just to save space (every little change, no matter how small, will force a multi-megabyte file to be saved) but also because it can get in the way of other people as they're compiling the document.
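As a rough illustration of the rule above, here is a small Python sketch that picks which files in a LaTeX project are worth checking in. It is not tied to any particular version control system. The suffix list beyond pdf, log, and aux, and the function name `files_to_track`, are assumptions for the example; adjust them to your own setup.

```python
from pathlib import Path

# Intermediate files that LaTeX tools regenerate on every build.
# Only log and aux are named in the text above; the rest are assumed typical extras.
GENERATED_SUFFIXES = {".aux", ".log", ".out", ".toc", ".bbl", ".blg"}


def files_to_track(paths):
    """Return the subset of paths that should go into version control.

    A PDF is treated as build output only when a .tex file with the same
    name sits next to it, so PDF figures are still tracked.
    """
    all_paths = {Path(p) for p in paths}
    tex_sources = {p for p in all_paths if p.suffix == ".tex"}
    keep = []
    for p in all_paths:
        if p.suffix in GENERATED_SUFFIXES:
            continue  # intermediate output
        if p.suffix == ".pdf" and p.with_suffix(".tex") in tex_sources:
            continue  # final output compiled from a tracked source
        keep.append(p.as_posix())
    return sorted(keep)


if __name__ == "__main__":
    project = ["paper.tex", "paper.pdf", "paper.aux", "paper.log",
               "figs/plot.pdf", "refs.bib"]
    print(files_to_track(project))
```

The same-name check is a deliberate choice: excluding every PDF would also throw away figures stored as PDFs.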

I also prefer using pdf and png files for graphics (as appropriate); this is because the output of pdflatex is usually better/easier than that of latex. I also think that ggplot2 is the best (free or otherwise) tool for generating plots.

What's up with your name? Why is it hyphenated? Why is your OIT username ying? What should I call you?

My parents' last names are Boyd and Graber. When I was born they hyphenated (why people whose nicknames were "Toni the Body" and "Little Grabber" would do so is beyond me; my nickname is obvious). As a result, I am deeply, personally, against hyphenating names. Don't do it. It's not a sustainable practice, and it leads to all sorts of problems. People think my last name is just "Boyd" or "Graber", web forms don't think I have a valid name, and there's only about a forty percent chance someone will get my name right after one telling.

When I first got to Maryland, I was doing a postdoc in UMIACS, and it didn't look like I'd use my OIT account for anything. So I chose "ying", my wife's family name. Then I became faculty, and now it gets used for tons of stuff, but it would likely be a huge hassle to change it.

Most people call me Jordan, which is just fine by me. I also answer to JBG.

How do I get to UMD on the train?

We're on the Northeast Corridor line of Amtrak, and you can buy tickets on the Amtrak website. The closest station geographically is New Carrollton. There are two ways to get from New Carrollton to UMD: you can either take the F6 bus (note that it does not run on weekends) or the UMD shuttle (which sometimes checks for a UMD ID). If you can't use either of those connections, you might want to consider going to Union Station in DC and taking the Metro to College Park (take the Red line from Union Station, transfer to the Green line at Gallery Place-Chinatown).

Note that College Park is also on the MARC Camden Line. This is useful if you're in Baltimore or DC. The line doesn't run frequently, but if it matches up with your schedule, it's a great way to get to College Park.

Note that if you're starting from New Jersey (e.g. Princeton or New Brunswick), the Amtrak website might make it seem that there are no trains. There are, but they just won't stop for you. Instead, you first have to take NJ Transit from one of those locations to Trenton or Metropark (Trenton is a much nicer station, and there's no backtracking if you're going south).

Buy your Amtrak tickets early, as they do sell out. Also note that prices are higher the longer you wait. If you're going to do this at all frequently, it's a good idea to buy a Student Advantage card.

What's your Erdős number?

While at Princeton, I was also involved with the aphasia project with Maria Klawe (as a result, I have an Erdős number of 2). My Bacon number is as yet undefined (call me).