# Project Milestone 2: Two Proposals

**Due: March 4 (submit PDF on Blackboard)**

For this milestone, submit two project proposals. Submitting two gives you options: I'll give feedback on both, and you'll commit to one (or a hybrid) in M3. Your direction may evolve, and that's expected; the milestone structure exists to catch such pivots early.

If you plan to work with a partner, list both names at the top. Both partners should submit the same document.

## For each proposal, address the following (1–2 paragraphs each is fine):

### 1. Working title

### 2. Research question

This is the most important part. A good research question is *testable* — your analysis should be able to support or refute an answer. You don't need a formal hypothesis yet, but you need to be heading toward one.

Some guidance on framing:
- A research question is **not** the same as a project description. "I want to analyze a corpus of tweets" describes a project, but it doesn't ask a question. What *about* those tweets? Do speakers of different dialects use hedging strategies differently? Does vocabulary diversity change over time in a particular genre?
- Try phrasing it as: *"Does [linguistic pattern] differ between [group/genre/context A] and [group/genre/context B], and in what way?"* You don't have to use that exact template, but it can help.
- It's okay if your question is broad at this stage. M3 is where you'll sharpen it into a specific hypothesis.

Examples of good research questions for this course:
- *Do different genres of the Brown corpus differ in their use of passive constructions, and if so, what might explain the differences?*
- *How does vowel space vary between two regional dialects of English, based on formant measurements?*
- *Is there a measurable difference in syntactic complexity between spoken and written registers in a Universal Dependencies treebank?*
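
To make *testable* concrete: the third example question only becomes answerable once "syntactic complexity" is tied to a number you can compute. The sketch below uses mean dependency distance (the average gap, in tokens, between a word and its head) as one possible proxy. That choice, the function name `mean_dependency_distance`, and the hand-written CoNLL-U fragment are assumptions made for illustration, not a required method.

```python
# A tiny hand-written CoNLL-U fragment (10 tab-separated columns per token).
# The sentence and its annotation are invented for illustration only.
SAMPLE = "\n".join([
    "# text = The dog chased the cat",
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_",
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_",
    "3\tchased\tchase\tVERB\t_\t_\t0\troot\t_\t_",
    "4\tthe\tthe\tDET\t_\t_\t5\tdet\t_\t_",
    "5\tcat\tcat\tNOUN\t_\t_\t3\tobj\t_\t_",
])


def mean_dependency_distance(conllu_text):
    """Average |token position - head position| over all non-root tokens."""
    distances = []
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        token_id, head = cols[0], cols[6]
        # Skip multiword-token ranges ("1-2"), empty nodes ("1.1"),
        # and tokens with an unspecified head ("_").
        if not token_id.isdigit() or not head.isdigit():
            continue
        if int(head) == 0:
            continue  # the root has no head, so it has no distance
        distances.append(abs(int(token_id) - int(head)))
    return sum(distances) / len(distances) if distances else 0.0


mdd = mean_dependency_distance(SAMPLE)
print(f"Mean dependency distance: {mdd:.2f}")
```

With a measure like this in hand, the question turns into a comparison you can actually run: compute it for a spoken treebank and a written one, then ask whether the difference is larger than chance would predict.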

### 3. Data

Be as specific as you can. Ideally, name an existing dataset and provide a URL or citation. If you're planning to collect or compile data, describe where it would come from and roughly how much you'd need.

If you're not sure where to find data, here are some starting points:
- [NLTK Corpora](https://www.nltk.org/nltk_data/) — Brown corpus, Gutenberg, and many others we've used in class
- [Universal Dependencies](https://universaldependencies.org/) — annotated treebanks for many languages
- [LDC](https://www.ldc.upenn.edu/) — Linguistic Data Consortium (UR has institutional access to some resources)
- [CHILDES](https://childes.talkbank.org/) — child language acquisition data
- [Open Speech corpora](https://openslr.org/) — speech recordings and transcripts

A concrete data plan is important. Projects that stall often do so because the data turned out to be harder to obtain or messier than expected. If you're uncertain about data access, say so — I can help you evaluate feasibility.

### 4. Methodology

What analysis would you do? Consider which tools and techniques from this course you'd apply. In particular:
- What will you compare? (e.g., genres, languages, time periods, or speaker groups)
- What would you measure, and how? (e.g., frequency distributions, POS patterns, acoustic measurements, statistical tests)
- How would you know if you found something meaningful?
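
The three questions above can be sketched in a few lines. This is a minimal illustration, not a template you must follow: the two token lists are invented stand-ins for genre samples, the feature set (forms of *be*) is a hypothetical choice, and the helper names `count_feature` and `chi_square_2x2` are my own. It compares how often the feature occurs in each sample and computes a Pearson chi-square statistic for the resulting 2×2 table.

```python
from collections import Counter

# Toy token lists standing in for two genres. These sentences are invented;
# a real project would load corpus samples instead.
genre_a = "the results were reported and the samples were analyzed by the team".split()
genre_b = "she opened the door and he looked at her and they were gone".split()

# Hypothetical feature to measure: forms of "be".
FEATURE = {"is", "are", "was", "were", "be", "been"}


def count_feature(tokens, feature_words):
    """Return (hits, misses): tokens in feature_words vs. all other tokens."""
    counts = Counter(tok.lower() in feature_words for tok in tokens)
    return counts[True], counts[False]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


a_hits, a_misses = count_feature(genre_a, FEATURE)
b_hits, b_misses = count_feature(genre_b, FEATURE)

# What you measure: rate per 1,000 tokens, so samples of different
# sizes are comparable.
rate_a = 1000 * a_hits / len(genre_a)
rate_b = 1000 * b_hits / len(genre_b)

# How you'd know it's meaningful: a test statistic for the difference.
chi2 = chi_square_2x2(a_hits, a_misses, b_hits, b_misses)

print(f"Genre A: {rate_a:.1f} per 1,000 tokens")
print(f"Genre B: {rate_b:.1f} per 1,000 tokens")
print(f"Chi-square (1 df): {chi2:.3f}")
```

For a 2×2 table (one degree of freedom), a statistic above roughly 3.84 corresponds to p < 0.05. Samples this small are far too sparse for the chi-square approximation to be trustworthy, which is one reason your data plan should say how much data you expect to have.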

You don't need a fully fleshed-out analysis plan — that comes in M3. But I should be able to see the shape of what you'd actually *do*.

### 5. Significance

Briefly: why does this matter beyond the scope of a class project? What bigger question does it connect to? What would we learn about language from the answer?

## Formatting and logistics

- Submit as a single PDF on Blackboard
- No strict length requirement, but 1–2 pages total is typical
- Both proposals in the same document
- If working with a partner, both members submit the same document

I'll give written feedback on both proposals. Use that feedback when narrowing to one idea for M3 (Abstract + Completion Plan, due March 20).
