
Jason M. Baldridge


Ph.D., University of Edinburgh


Courses


LIN 313 • Language And Computers

40120 • Spring 2016
Meets TTH 9:30AM-11:00AM CLA 0.104
QR

This undergraduate class looks at everyday tasks that involve natural language processing: document classification, spelling and grammar correction, dialogue systems, machine translation, cryptography and forensic linguistics. Students will get insight into how these systems work (and why it is still so difficult to do natural language processing well). We also examine social and ethical considerations such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.

 

LIN 389C • Research In Computational Linguistics

40085 • Fall 2015
Meets F 2:00PM-5:00PM CLA 4.422

This course will combine discussion and presentations, covering topics such as recent trends in computational linguistics and machine learning, as well as software and tools for data analysis. We will focus on research design, data sources, presentation and publication of research data and analysis, and dissertation writing. The course will give students the opportunity to pursue their own research in a guided, collegial environment. Graduate students from other departments are welcome!

C S 395T • Applied Natural Language Processing

53730 • Spring 2013
Meets MW 1:00PM-2:30PM GAR 1.134
(also listed as LIN 386M)

Advances in computational linguistics, machine learning, and computer hardware over the last two decades have produced a powerful set of tools and capabilities for the automatic processing of natural language texts. We now have access to a massive quantity of free-form natural language text available on the Internet and in large text collections (including out-of-copyright books). Increasing quantities of text in a wide variety of languages are being produced every day through news, blogs, and social media. Consequently, the ability to process natural language to categorize and cluster texts, find and visualize patterns in them, or even just to find them at all has become increasingly important. A wide variety of disciplines, from linguistics to psychology to archeology to literature and beyond, are coming to rely on natural language processing tools that enable them to ask new questions of corpora of interest to them. This is particularly evident in the ascendancy of digital humanities, where researchers would often like to identify interesting patterns in corpora that are too large to be manually inspected. There is also a great deal of commercial interest in systems that can process unstructured textual data to extract, categorize, and present the information contained in it, and in some cases, to use it to predict things about the real world, such as the expected opening day revenues for movies based on social media chatter.

This class will provide instruction on applying algorithms in natural language processing and machine learning for experimentation and for real world tasks, including clustering, classification, part-of-speech tagging, named entity recognition, topic modeling, and more. The approach will be practical and hands-on: for example, students will program common classifiers from the ground up, use existing toolkits such as OpenNLP, StanfordNLP, Mallet, and Breeze, construct NLP pipelines with UIMA, and get some initial experience with distributed computation using Hadoop. Guidance will also be given on software engineering, including build tools, git, and testing. It is assumed that students are already familiar with machine learning and/or computational linguistics and that they are already competent programmers. The programming language used in the course will be Scala; no explicit instruction will be given in Scala programming, but resources and assistance will be provided for those new to the language.

LIN 313 • Language And Computers

40915 • Spring 2013
Meets MWF 10:00AM-11:00AM GAR 1.126
QR

This undergraduate class looks at everyday tasks that involve natural language processing: document classification, spelling and grammar correction, dialogue systems, machine translation, cryptography and forensic linguistics. Students will get insight into how these systems work (and why it is still so difficult to do natural language processing well). We also examine social and ethical considerations such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.

 

LIN 313 • Language And Computers

40750 • Fall 2012
Meets TTH 11:00AM-12:30PM JES A215A
QR

This undergraduate class looks at everyday tasks that involve natural language processing: document classification, spelling and grammar correction, dialogue systems, machine translation, cryptography and forensic linguistics. Students will get insight into how these systems work (and why it is still so difficult to do natural language processing well). We also examine social and ethical considerations such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.

 

LIN 386M • Applied Text Analysis

40885 • Spring 2012
Meets W 11:00AM-2:00PM PAR 10

Advances in computational linguistics, machine learning, and computer hardware over the last two decades have produced a powerful set of tools and capabilities for the automatic processing of natural language texts. We now have access to a massive quantity of free-form natural language text available on the Internet and in large text collections (including out-of-copyright books). Increasing quantities of text in a wide variety of languages are being produced every day through news, blogs, and social media. Consequently, the ability to process natural language to categorize and cluster texts, find and visualize patterns in them, or even just to find them at all has become increasingly important. A wide variety of disciplines, from linguistics to psychology to archeology to literature and beyond, are coming to rely on natural language processing tools that enable them to ask new questions of corpora of interest to them. This is particularly evident in the ascendancy of digital humanities, where researchers would often like to identify interesting patterns in corpora that are too large to be manually inspected. There is also a great deal of commercial interest in systems that can process unstructured textual data to extract, categorize, and present the information contained in it, and in some cases, to use it to predict things about the real world, such as the movement of stock prices.

This class will provide a practical introduction to many of the core algorithms in natural language processing and machine learning that are useful in a wide variety of text analysis applications, such as authorship attribution, sentiment analysis, information extraction and geolocation. We will cover algorithms for clustering, classification, part-of-speech tagging, topic modeling and named entity recognition, as well as methodologies for evaluating their success and methods for visualizing their outputs. The course will include an introduction to the programming language Scala, which will be used for homework assignments. Assignments will provide experience with the methods as well as with popular open source toolkits such as Apache OpenNLP and Mallet. No prior programming experience is assumed.

More information can be found at the course website: http://ata-s12.utcompling.com/

LIN 386M • Introduction To Computational Linguistics

40778 • Fall 2011
Meets W 12:00PM-3:00PM PAR 10

Advances in computational linguistics have not only led to industrial applications of language technology; they can also provide useful tools for linguistic investigations of large online collections of text and speech, or for the validation of linguistic theories.

Introduction to Computational Linguistics introduces the most important data structures and algorithmic techniques underlying computational linguistics: regular expressions and finite-state methods, context-free grammars and parsing, feature structures and unification, taxonomies, distributional representations and pattern-based approaches. The linguistic levels covered are morphology, syntax, semantics and lexical semantics. While the focus is on the symbolic basis underlying computational linguistics, a high-level overview of statistical techniques in computational linguistics will also be given. We will apply the techniques in actual programming exercises, using the programming language Python and the Natural Language Toolkit. Practical programming techniques, tips and tricks, including version control systems, will also be discussed.

Course site: http://icl-f11.utcompling.com

C S 378 • Natural Language Processing

53570 • Spring 2011
Meets MWF 12:00PM-1:00PM CBA 4.328
(also listed as LIN 350)

In the age of the Internet, there is a considerable demand for technology helping users to manage, search and access the enormous amount of information that is available. There is also a need for speech interfaces to computer systems of various types, from tutoring systems to automated customer support lines to robots. Examples of language-technological applications are the identification of the correct sense of an ambiguous word like “bass” (fish or musical instrument), automatic recognition of the language in which a document is written, machine translation, and automatic extraction of information from documents.

The field of computational linguistics deals with both the science behind providing such capabilities and the actual creation of applications which implement them. This course discusses the main natural language processing applications and provides an introduction to the key representations and algorithms used in computational linguistics and to the use of machine learning for NLP tasks.

The course will be oriented towards hands-on experience of language processing techniques. Previous programming experience is required.


Course Requirements
Assignments (70%): A series of six assessed, equally-weighted assignments will be given out during the semester.
Mid-term Exam (15%): There will be a mid-term exam over the material covered during the first half of the semester.
Final Exam (15%): There will be a final exam over material covered after the mid-term.
The course will use plus-minus grading, using standard scales. Attendance is not required, and it is not used as part of determining the grade.


Syllabus and Text

Here is the syllabus: http://comp.ling.utexas.edu/courses/2010/spring/natural_language_processing

The official course textbook: Jurafsky, D. and J. H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition. Upper Saddle River, NJ: Prentice-Hall, 2008.

LIN 313 • Language And Computers

41090 • Spring 2011
Meets MWF 10:00AM-11:00AM PAR 206
QR

This undergraduate class looks at everyday tasks that involve natural language processing: document classification, spelling and grammar correction, dialogue systems, machine translation, cryptography and forensic linguistics. Students will get insight into how these systems work (and why it is still so difficult to do natural language processing well). We also examine social and ethical considerations such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.


Course Requirements

Assignments (45%): A series of six assessed assignments will be given during the semester. The lowest grade will be dropped, so each of the five assignments that count is worth 9%.

Essay (15%): A 1000-1500 word essay on a topic dealing with the social implications of computational applications for language.

Mid-term Exam (20%): There will be a mid-term exam on October 15 over the material covered in class up to October 6.

Final Exam (20%): The final exam will be given during finals week and will cover all course material.

The course will use plus-minus grading, using standard scales. Attendance is not required, and it is not used as part of determining the grade.


Syllabus and Text

Syllabus is here: https://sites.google.com/site/languageandcomputersfall2010/

There is no official course textbook for this course since the topic is quite new. Some readings will be assigned and made available for download or copying.

C S 395T • Semisupervised Learning For Computational Linguistics

52702 • Fall 2010
Meets TTH 2:00PM-3:30PM PAR 101
(also listed as LIN 386M)

Course Description

The field of computational linguistics has undergone a major shift over the last two decades toward statistical methods. For some tasks, such as language modeling, there is a wealth of data available for training models, but for many tasks, the performance of models is severely limited by the amount of relevant labeled training material. Semisupervised learning seeks to use small amounts of annotated data in combination with (possibly) large amounts of raw text to improve performance over using the annotated data alone. This class will look at the theory and methods behind semisupervised learning in the context of computational linguistics.

Texts

Abney. Semisupervised Learning for Computational Linguistics. We will also make use of other readings from books, articles and lecture notes, which will be made available on the course website.

LIN 313 • Language And Computers

40675 • Fall 2010
Meets TTH 11:00AM-12:30PM PAR 1
QR

Course Description

This undergraduate class looks at real world tasks that involve natural language processing: authorship attribution, text classification, spelling and grammar correction, machine translation, cryptography and some others. Students will get insight into how these systems work (and why it is still so difficult to do natural language processing well). We also examine social and ethical considerations such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.

Grading Policy

Assignments (8% each): A series of six assessed assignments will be given during the semester.

Essay (12%): A 1000-1500 word essay on a topic dealing with the social implications of computational applications for language.

Mid-term Exam (20%): There will be a mid-term exam on October 15 over the material covered in class up to October 6.

Final Exam (20%): The final exam will be given during finals week and will cover material discussed after the midterm.

The course will use plus-minus grading, using standard scales.

Texts

There is no appropriate textbook for this course. Reading material will be made available through the course website.

C S 378 • Natural Language Processing

54345 • Spring 2010
Meets MWF 11:00AM-12:00PM CBA 4.328
(also listed as LIN 350)

LIN 312 • Language And Computers

41090 • Spring 2010
Meets MWF 10:00AM-11:00AM PAR 206
SB

LIN 393S • Categorial Grammar

41140 • Spring 2007
Meets TTH 12:30PM-2:00PM CBA 4.340

LANGUAGE, POWER & ACTION: A SEMINAR ON HIDDEN MEANING

Hans Kamp/David Beaver

Prerequisites

Graduate Standing and Consent of Graduate Advisor or instructor required.

Course Description

This class will cover a central topic in Philosophy of Language and Linguistics: the effects of context on meaning, and the way in which meanings index that context. We will construe “indexicality” broadly, to refer to all aspects of meaning that are dependent on context. Thus indexicality covers a huge range of expression types and usages. It includes:

● paradigmatically indexical expressions like “I” and “now”,

● expressions described as demonstrative or deictic,

● anaphoric expressions,

● modals and quantifiers for which the domain is contextually restricted,

● predicates of personal taste,

● discourse and dialogue markers that index aspects of the utterance situation or discourse, and

● various markers of perspective, subjective attitude and social relationship among interlocutors.

What all these phenomena have in common is that they challenge a simplistic view of meaning as a mapping from form to meaning. Meaning isn’t determined by form alone but by a combination of form and context. The general question this gives rise to is: What aspects of meaning are affected by context, and what aspects of meaning are invariant? And even more generally, what do indexical phenomena tell us about the use and interpretation of language, and the nature of the semantic/pragmatic interface?

We will consider these issues from both philosophical and linguistic perspectives, discussing both philosophical frameworks (e.g. two dimensional approaches to meaning, centered worlds), and linguistic generalizations (e.g. about pronominal systems, spatial deixis, and temporality).

Topics

  1. Indexicality, Deixis and Anaphora: Demarcation of the notions and introduction to major accounts of them: Kaplan, Stalnaker (& Dynamic Semantics), Nunberg (1993).
  2. Context Dependence: Forms it takes + nature of context and contextual information
  3. Dimensions of indexicality: temporal, spatial, domain restriction, scalarity (e.g. the adverbs ‘too’, ‘enough’), empathy, honorifics, expressives
  4. ‘Reference’ to the self
  5. Predicates of personal taste: non-truth conditional? Also: Contextualism vs. relativism.
  6. Implicit arguments.

Publications


 

For a complete list of publications, see the link below:

http://www.jasonbaldridge.com/papers
