309 Comments
Jul 18, 2023·Liked by Maya Bodnick

The answer is simple: have Texas Instruments release a simple LLM and then force students to buy that same largely unmodified LLM for the next 40 years at an inflated price.

Jul 18, 2023·edited Jul 18, 2023·Liked by Maya Bodnick

Wonderful essay, Maya. Yet another piece of evidence that Harvard has lax standards ;)

author

You're always coming for my school in these comments Milan smh

They hated Jesus because he spoke the truth

Just give Milan one of these if he keeps doing it:

https://comb.io/SrnJ6M

Jul 18, 2023·Liked by Maya Bodnick

My immediate reaction was that this sounds like the product of rampant grade inflation and that I absolutely don’t want my own child at Harvard.

A C for a paper with no coherent thesis or argumentation? I’d have been failed out of any of my writing, theology, philosophy, or history classes for writing such stuff.

Jul 18, 2023·Liked by Maya Bodnick

I am just dirty big-state-school chaff, but I would have preferred to think that the Hahhvahd Difference (TM) would be an instructor who had enough time and patience to say "this is an F- as written, for X, Y, and Z reasons--do it again"

The end result might look like grade inflation (everyone gets at least a gentleman's C!), but it would be the result of iterating on bad work until it was not bad anymore.

Jul 18, 2023·Liked by Maya Bodnick

Humanities grad student at Harvard here, and yes, the grade inflation is really that bad. The only grades lower than an A- I've given have been to students who didn't even turn in major assignments - and those students still got at least a C. It's already enough of a pain defending myself against students when I give an A-. My first semester, when I didn't realize *how* intense the inflation was, I gave lots of Bs and B+s for work that easily could have been Cs (knowing that the students had the opportunity to rewrite for a better grade, no less!). I caught on by the end of that semester, but my first round of teaching evals has more than one complaint about "harsh grading." The incentives on grad student TFs, who are often doing the bulk of the grading, are probably an underrated component of why grade inflation has gotten so out of control.

Jul 18, 2023·edited Jul 18, 2023·Liked by Maya Bodnick

My experience as a TA at a peer school was different. We didn’t have a literal curve but we aimed for a higher B or lower B+ median. I had only a handful of students complain (fewer and fewer as I gained more experience) and though these confrontations were never fun I ALWAYS felt like my professors, and the university, had my back. My say on the grades was final, and the students realized that. Is your experience very different?

P.S.

Incidentally complaints were mostly by the B+ students, sometimes by the A- students. The C and below students (which we had in every class, to be clear) never complained, though I would try to seek them out to see where I could help them improve.

Jul 18, 2023·Liked by Maya Bodnick

So I think it varies wildly by professor within Harvard, and obviously institutional cultures will be different here. Basically all of my professors start the course by telling students that if they put in the work they will get an A.

But I also think that institutional commitments to not regrading are kind of orthogonal to the problem. I did have one professor commit to backing me up on grading and going so far as to say he'd offer to regrade students' papers more harshly if they complained, but that doesn't really matter if I'm still going to get blasted in the teaching evaluation if students feel like they were entitled to an A. (And it's self-reinforcing - it's not like all of my students are entitled brats, they're just anxious and correctly perceive that because there's so much grade inflation and everyone knows it, if they were to get a B it would look to an outside observer like they really did D-level work.)

But I did my undergrad at MIT, notoriously not grade-inflating, and I got plenty of Cs and wouldn't have dreamed of giving a TF a bad eval for that.

So basically I think once the grade inflation is in place, all the incentives are misaligned and it's hard for any individual action to change it, even if students' complaining doesn't ever translate into actual grade changes.

Jul 18, 2023·Liked by Maya Bodnick

How important are student evals for a TA?

My experience swung wildly based on school/program, but curious how that differs at Harvard.

Jul 18, 2023·Liked by Maya Bodnick

P.S.

>> Basically all of my professors start the course by telling students that if they put in the work they will get an A.

I guess it depends what you mean by “put in the work”. I was consistently taught that work that requires reasonable effort from the average student and produces average ideas gets you a B+. To get into the A range you need to actually go beyond that and display some cleverness, originality, etc. Importantly, however, one of the things I constantly needed to explain to my students is that grades aren’t a judgment of their moral worth but of the quality of the work they produced. It’s not about how much effort you, personally, put in but rather how it came out. We don’t grade you on your exertion. Thus I don’t personally like the “put in the work” phrasing though I get why it’s being said.

Jul 18, 2023·Liked by Maya Bodnick

Yes, I agree the institutional culture is the most important factor. However, there are always bounds, and you have wiggle room within them.

I don’t think MIT and Harvard have dramatically different grading standards, at least in the T parts of STEM

Jul 18, 2023·Liked by Maya Bodnick

Yeah, this needs to be fixed at an institutional level -- implement frequent audits and punish (I'm not sure how?) departments that fail to meet them.

Failing multiple classes my sophomore year is one of the best things that happened to me in college (although competent mental health counselors might have made it unnecessary) -- it was the rock bottom that I needed to make necessary changes.

Jul 18, 2023·Liked by Maya Bodnick

BRING BACK THE CURVE! GRADE ON THE BELL!

ONLY THE STRONG WILL SURVIVE!

As a Brit, I can't imagine ever having a single person marking anything (that counts to the final grade; this is why there are so many essays and so on that don't count to the final grade). There would always be a second person producing a mark (normally a percentage), and you'd usually just average them unless they were radically different. If they are radically different, then the two academics need to have a discussion whether they just have a difference of opinion on the given essay, whether one of them missed something or whether they are applying different standards (in which case they need to get with the institutional standard).

Translating a mark into a grade (in UK terms, going from 65% to a 2.1; in US terms to a B+ - note that UK percentages are lower because the percentage scale for an essay is meant to be the same for graduate work, so a PhD thesis for the same topic might get 95% and clearly even the best undergraduate work is going to be 75% on that scale) is a complex thing that involves both multiple academics from the department and the External Examiner (an academic seconded from another university, involved to try to ensure that the same grades mean the same thing right across the country).

author

I'm also concerned about grade inflation and this seems to be another (incidentally) revealed phenomenon in my article. Having 2 graders seems like a good solution for accountability.

Jul 18, 2023·Liked by Maya Bodnick

What are the faculty to student ratios like in the UK? My grad student teaching load is positively sparse compared to peers at state schools, and I still barely have enough time to give thorough feedback to all of my students. I can't really imagine having double the load! (Are grad students doing the grading in the UK?)

founding

My friends who work at British universities complain about all the excess work they have to do, submitting exams a semester or two in advance and grading work from other people’s classes and the like. Some of it works for quality control (though some of it is counterproductive - sometimes you want the exam to reflect discussion topics that came up in class, or in response to current events) but it’s not at all obvious it’s worth all the extra work that goes into it.

Jul 18, 2023·Liked by Maya Bodnick

I remember failing 60 percent of students when grading Danes for the first time. I got a panicked response from a senior faculty member because the department is paid based on the number of students who pass exams (Danish system is a joke.)

Why bother with grades at all if it’s like that? I get the impression that these are kids who are accustomed to getting nothing less than an A, but if the A is automatic the whole idea of grading seems silly.

My GPA was a low B from a small liberal arts school that was playing around at the time with not letting students ever see their grades. Then the graduate school transcript had no grades, just Pass/Fail - another experiment of the times. Finally, post-bacc computer science and calculus grades weren't that great either because those profs keep to strict curves. When I applied for the Peace Corps forty years later I had to defend the GPA since it fell below their standards. I mentioned curves and referenced my successful career (which clearly indicated I wasn't stupid.)

Even if I had seen my college grades, I probably couldn't have improved them much since my public high school education left me intellectually unprepared for the college's writing and participation standards. Included in the curve were private school students who had had much more practice and tutoring. (Such a disadvantage can be made up later, however!)

I had a very similar experience as a TA at another elite school about a decade ago. Students that did nearly nothing and did not learn a bit of the course content had a floor of C grades, and I was also shocked at how resistant professors were to saying anything to students who were obviously cheating (meaning obvious, “turned in the professor’s own work as their exam”-type cheating).

Grades are already pretty broken as a ranking mechanism.

Jul 18, 2023·Liked by Maya Bodnick

As fellow "dirty state school chaff" (I suspect we may share alma maters...) who is now faculty at a brand name fancy-pants private school... this very much feels like one of the important differences. The resources, and thus time and engagement level of the faculty/instructors in support of the students is orders of magnitude above and beyond. Is this coddling? Or just a great educational experience? The students I interact with are almost uniformly very impressive and fantastically well-prepared. I would hire or recommend almost all of them!

*I do not really interact with undergraduate students, only grad and post-grad students. But I can compare my experience now to my experiences at (good!) large public schools for my own education.

Ew, unwashed masses people are yucky.

As to your thesis, that’s what I’d have expected too… both my parents went to Penn State and felt Notre Dame coddled me enough as is because of the personalized attention we got, even though it was more rigorous than their coursework too.

Apparently Harvard just hands out B’s for failure without seeking to rectify the problem.

As I understand it, Notre Dame is one of those bizarre institutions that focuses on educating the undergrads (?????), rather than an objectively correct, Big-Ten-ish understanding that undergrads are only around so we have TA positions for our beginning graduate researchers.

I’m only now coming to realize how radical that notion, and the use of its endowment for actual financial aid, truly are, a decade after graduating.

Jul 18, 2023·edited Jul 18, 2023

I got a B with a 92% in ME354 at Purdue. I'm still pissed about it 20 years later.

https://www.purdue.edu/freeform/me354/wp-content/uploads/sites/28/2022/07/220503_ME354_Final_Exam_solution.pdf

Professors do not care about undergraduates. A very small % of them are bright enough to actually get their attention.

Professors just want to research and write papers

Jul 18, 2023·Liked by Maya Bodnick
Comment deleted

in his defense, Señor Chang just cared very deeply about giving you a quality education...

I think you might be overrating the writing abilities of the average human

The average human, I agree, but aren't you Ivy types supposed to be decidedly unaverage? :p

Everyone keeps posing ChatGPT as a threat to smart people who write for a living, which I cannot see at present. That it can replace the average high school graduate's writing is not really a point of contention.

Also, does the report button finally go to you now?

Nope

Jul 18, 2023·edited Jul 18, 2023

FFS, it now claims it does lol.

Can't seem to get the link function to work for a specific comment but this was in the Mailbox chat yesterday:

Robert M.

19 hr ago

Why don't you start with resigning your US Citizenship? I hear the WEF has global-directed programs which might be more to your liking. They don't have their own flag, but maybe you could recommend they get one.

Email me the link to his comment and I’ll address it when I get home

Jul 18, 2023·Liked by Maya Bodnick

Yes, grade inflation is a real thing. You can and do give Fs but it’s very rare indeed (slightly less so in language classes).

I find it surprising but to some extent, this makes sense. Harvard has a really effective filter in place in its admissions.

In contrast, my undergrad (fairly well-regarded engineering program at an ok-but-not-great huge state school) took basically the opposite approach: admit a lot of people, then have 60-70% attrition among aspiring engineering majors as a filter.

Harvard could give an F or C for submitting 100 pages of random 0s and 1s, and it would have no impact on educational quality. Assigning terrible grades to the long left tail of the worst submissions is just admin work for no gain

That’s an interesting theory to which I don’t subscribe.

A C is 2+ sigma below the mean for Harvard - the (very real) grade inflation means that it's a really bad grade! The letters themselves have no intrinsic meaning

Perhaps professors should keep two sets of books: one is the grade (B through A+) which they give to the students, and the second is the one in which they translate those grades: B=F, B+=D, A-=C, A=B and A+=A.

If you get the same distribution of student grades, just across a "narrower" range of these arbitrary letter designators, does it really matter? As long as there aren't huge differences among classes.

It matters because there are differences between classes, unfortunately. Taking hard STEM is much riskier, especially if you’re not a major, which pushes students away from the most interesting classes

Did you just write out “sigma” in the Latin alphabet to signify “standard deviation?”

\sig would be how I usually write it, wanted to be clear ;)

Comment deleted

A lot of your posts here over the last few weeks add up to the notion that the mediocre-or-worse scions of the professional classes are in some way entitled to a professional class job and standard of living themselves.

Suffice it to say, I disagree in the strongest possible terms.

Comment removed

Maitland Jones Jr.

Universities hate security of employment for non-tenure-track faculty. But this guy's story provides at least one argument in favor of just cause rights after a probationary period.

Like many of you, I work in higher ed (I teach in a STEM field). Even at the upper-tier state school where I work, grade inflation is high and student entitlement is higher. I now routinely budget several days after grades are submitted for dealing with students' grade complaints, and they're dominated by the A-/B+ crowd who "feel" they "deserve" an A.

(Maybe someday we can get a full post on grade inflation in higher ed!)

Comment removed

Interestingly, at least at my flagship state university's college, the admissions standards are almost certainly *lower* for out-of-state students. These out-of-state students are majority international, mostly from Vietnam, China, and India; most of the rest are US citizens of Asian descent. Out-of-state students make up ~745 students out of the ~1500 undergraduates in my college's bachelor's programs. As close as possible without being majority out-of-state.

Why so many international students? Because, perversely, I teach at a state school. And as the joke has gone for 30+ years, no state school would have budget problems if all of its undergraduate students all paid out-of-state tuition. One way to approach that is to differentially admit more out-of-state students to desirable, competitive majors (CS, engineering, business). And one way to do that is to lower standards for international students, who generally get negligible-to-no financial aid and are desperate for a way into the US.

To be clear, I love all my students but it's so obvious that many of these international students, despite desperately wanting to succeed and working their butts off, are just totally unprepared on a variety of axes for life at a large residential university in the US.

Faculty have no say in admissions, at least where I am; it's a purely administrative matter. The only knob we can turn is to request a higher or lower target in our incoming class, and the admissions office "balances" the incoming class in accordance with the many constraints the upper-level administration imposes. And clearly, "get more international students who pay full freight" is one of those constraints.

what do you expect Harvard is just a football school....

Like, why is “no analysis” a B? Why isn’t that an F? I guess you get what you pay for.

What's funny to me is that the graders knew this was an experiment, and the paper was either written by AI or by a student who hadn't even taken the class and who would not protest the grade because it wasn't real. Why bother inflating grades except out of pure habit/loss of perspective?

The obvious next step is for you to repeat this experiment at Yale.

I winced at the UC Berkeley grade deflation due to personal experience.

Jul 18, 2023·Liked by Maya Bodnick

My reaction is that the points about how weak ChatGPT was on analysis, substance, and argument are really important and make me optimistic about the role of humans. I think we have a weakness for well-written prose that could be overcome as that becomes abundant. The future is in being super-rational and focusing on the substance. Stop giving credit for well-written work, just as you wouldn't say "well this has no typos so I guess it must be right."

author

Sure, but this was ChatGPT's performance 9 months after its release. It'll probably get much, much better

I am skeptical about the ceiling of LLMs and neural networks. Once you run out of training data or use AI generated data (basically no new information is being added to the system), things can break.

The ability of Chat GPT to produce convincing human quality text at zero cost (once trained) will improve. This means noise. That is the problem:

https://readnoise.com/

the biggest question people should be asking is whether we are near the "diminishing returns" part of the improvement curve.

I guess I am skeptical about how you can take in large bodies of text and end up writing better than the inputs. So all the fretting about how "chatGPT will be better than us in 4 months at this rate" seems potentially misguided. The ceiling might be writing like an 80th percentile human or something. Which is amazing, don't get me wrong, but not total life-changing magic.

But we'll see - it's not my area of expertise.

The early Chess and Go engines were all trained on human data and went on to greatly surpass what any human was capable of. And then the ones trained only on self-generated AI data went on to surpass the engines trained on games from humans.

Now obviously writing a thoughtful and persuasive policy analysis is an extremely different skill from winning even a very complex board game. It makes sense to be skeptical that AI could end up writing better than its inputs, but it's still possible that it might do exactly that.

I'm just one of the liberal arts types whose job will get automated away by all this so I'm by no means an expert. But I do know people in the field who did not think scaling up from GPT-2 to GPT-3 was going to have the kind of leap in intelligence and reasoning that we saw there. To vastly oversimplify things: training on large enough datasets really does seem to have emergent properties that are greater than the sum of their parts.

Although one big difference is you don't need a very advanced model to "score" your chess/go games. At a minimum, did you win?

What helps judge the quality of chat gpt output so it can iterate?

(Students submitting it to Harvard graders?)

Jul 18, 2023·edited Jul 18, 2023

Kind of ... It was in development for 2 years before release. It's very TBD where we are on the lifecycle. We could be approaching the limit of what an LLM can do.

EDIT: See other post for this Bard example ... "So, the correct torque is 400 ft-lbs * 12 = 1381 in-lbf." Which is hilariously wrong.

Jul 18, 2023·Liked by Maya Bodnick

In my experience across 30 years in academia, the law, and the military, a lack of typos is absolutely a strong indicator of good work product across the board. Attention to detail carries over in life.

But you’re right that AI may destroy the reliability of heuristics like the above.

Jul 18, 2023·Liked by Maya Bodnick

Give ChatGPT a year or two and it will solve this problem.

Jul 18, 2023·Liked by Maya Bodnick

These AI models are inscrutable. That is a major challenge Ezra highlighted a while back.

It could also reflect that those particular professors/fields have poor taste in writing.

founding

It’s not about taste in writing. Very few professors have a taste for freshman essays of any sort. It’s about what sort of freshman essay relieves the pain enough to get an A.

Jul 18, 2023·Liked by Maya Bodnick

This is a very good point, and I think this bias rears its ugly head not just in rational arguments, but moral ones. I'll cite the James Damore memo. It's sexist garbage, but it was well written (not well *reasoned*, but well written) enough that huge portions of the population thought that his points weren't that bad and he continued to be invited to large media forums to present his views. Had he expressed those same views in the language that most people use, he'd have been much more thoroughly ostracized.

A more crass example is Trump's "grab em by the pussy" moment. There was so much emphasis on "locker room talk", and indeed the very fact that half of the country seemed to think "it was locker room talk" is a reasonable defense, as if the word "pussy" was the problem, and not the act he was admitting to, says a lot about how people interpret language. If Trump had said something with the same meaning but used different language ("I sometimes reach into their lady parts"), would the outrage have been lesser? I think it would have, and that's kind of astounding.

founding

I actually think that if he had used nicer language the outrage would have been greater. It could no longer be dismissed as “locker room talk” but would be taken as indicative of his sober intentions.

Wondering if y'all watched the same tape I did lol. He wasn't bragging like a guy in a locker room, he was pretty soberly describing his behavior.

Locker room talk is more like "DUDE this chick was SOOO hot for me and [insert graphic depiction of sexual acts]", not "Yeah, so what I usually do is I just grab 'em by the pussy".

founding

I didn't actually watch it myself, and my guess is that a lot of people were, like me, reacting to the transcripts.

Heterodox Academy did a review of the James Damore letter.

He was largely correct on the science.

Reasoning is, itself, not science, it is a tool used by scientists to conduct scientific research.

It is possible to use a set of facts to come to ridiculous conclusions about how one should act. It is also possible to come to completely wrong conclusions about the cause (or consequence) of a set of facts.

His letter is pure idiocy. Well written idiocy.

"In conclusion, based on the meta-analyses we reviewed and the research on the Greater Male Variability Hypothesis, Damore is correct that there are “population level differences in distributions” of traits that are likely to be relevant for understanding gender gaps at Google and other tech firms. The differences are much larger and more consistent for traits related to interest and enjoyment, rather than ability"

"If our three conclusions are correct then Damore was drawing attention to empirical findings that seem to have been previously unknown or ignored at Google, and which might be helpful to the company as it tries to improve its diversity policies and outcomes"

I don't think you read his whole letter. I'm not referring to whether those findings are correct, I am referring to the conclusions about what one should do about those facts. Did you take a look at James' suggestions? They are, again, sheer idiocy.

To begin with, whether or not there are population level differences in the distribution of traits in a general population is, at best, orthogonal to the question of whether they exist within the population of Google employees. One has to assume that there are pretty big selection effects at work, no?

This kind of thinking is literally the very definition of stereotyping.

Comment removed

Given the rest of this very interview (how he refers to "not even waiting for permission", and such), and given, well, every piece of evidence we have about the type of person that Trump is, I think it's ridiculous to think he was being "crudely metaphorical" here.

The second half of your statement further betrays the issue. You think the *language* is problematic, or maybe the "crudeness", but not the behavior being described.

I've spent a lot of time in locker rooms (vastly more than Trump, I am pretty sure) and can assure you that although men exaggerate about a lot of stuff, pretty much no one brags about sexual assault, so it would be *really freaking weird* for someone to make up a sexual assault and brag about it.

Jul 18, 2023·Liked by Maya Bodnick

My blackpilled take on all this (as somebody with a PhD in a humanities field who TA’d classes at an elite but not quite Harvard-level university) is that “passing grades in humanities classes” even at elite universities is not— and may never have been— a good signal of genuine critical thinking, analytic writing, or textual research skill. Chat-GPT can get passing grades because we give passing grades to a lot of intellectually empty fluff.

At the individual level, the institutional and interpersonal incentives to give almost everyone at least a C are very strong (actually failing students comes with both administrative overhead and the knowledge that your decision will probably make that student’s life materially worse). Departments are also reluctant to raise standards because of the possible negative effect on enrollments.

Ultimately, this dynamic is really bad for the long-term health of these academic fields. Low standards make it impossible for departments to credibly certify that their students actually have their field’s core skills— and that their best students are genuinely excellent rather than just baseline competent. This hurts the disciplines’ prestige and their graduates’ labor market prospects. It also attracts lower-quality students to the major, reducing the quality of classroom discussion and engagement— limiting opportunities for the strongest students to deeply hone their skills.

Unfortunately, faculty in these disciplines are reluctant to identify this issue as a serious collective action problem— and certainly aren’t taking steps to solve it. LLMs might provide some much-needed impetus to change, but I’m not holding my breath.

Jul 18, 2023·edited Jul 18, 2023

That seems broadly correct. There is a vicious cycle at work.

Bad incentives. All the way down. Like malicious turtles stacked on top of each other.

I'm going to mangle an excellent phrase I heard elsewhere, but it was something like

"The professors don't want to teach, the students don't want to learn, and the parents don't want to pay for low grades. Solve for the equilibrium."

founding

Getting a C rather than an F basically does just mean you did the assignment. But getting an A usually requires actually making an argument. It’s possible that my experience here is related to why philosophy generally gives the lowest grades in the humanities, though I’ve always assumed it’s more that we ask them to write a very different kind of thing than what everyone else wants them to write.

“Actually makes an argument” is kind of a low bar, though.

Sadly, a lot of students at selective schools have trouble clearing it even with quite a bit of feedback and assistance; articulating a thesis seems to be one of those things like variable assignment in computer science classes where a decent chunk of even above average IQ people have a weird cognitive block.

It's almost as if a lot of higher education is about signaling rather than about imparting knowledge and skills.

Even the human capital enhancement component of the education wage premium is impossible to capture without some sort of credible certification mechanism.

Comment removed

Very few historians actually believe that all perspectives/arguments are valid, but a lot of them aren’t really good at pushing on students in class without making the less confident ones shut down or disengage. (This is actually really hard, especially in the context of America’s conflict-averse cultural norms— you need to project a really strong combination of warmth, authority, and openness to pull it off.) I think a lot of instructors tend to err on the side of conflict aversion rather than the side of over-harshness— because they’re also conflict-averse Americans, because discussion sections where a lot of students fail to participate are an awful drag, and because students harshly punish instructors they consider cold, cruel, or arrogant in evaluations.

I imagine that the dynamic gets even more fraught in something as personal/sensitive as creative writing is.

Expand full comment
Comment removed
Expand full comment

Most instructors would probably agree with you about student evaluations— they’re widely hated.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

The in-person essay is the fix. A European style interview could also be fun.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

BLUE BOOKS BLUE BOOKS BLUE BOOKS!

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

Oral exams in defense of the papers would be great, but are just time-consuming.

Expand full comment
author

I would be so scared of oral exams even as a former debater. With that said, I think it's a good idea.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

We're moving to do (very basic) versions of these even for pre-undergrad students next year.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

I have always felt that "timed, but with generous amounts of time" is an underrated form of evaluation.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

What is a "European style interview"?

My idea is that in discussion sections, the TA randomly picks one student and tells him/her to summarize their paper, identify the strongest point and worst flaw in it, and what they had to cut out but most wanted to keep in. But don't tell them those questions before the class.

Oh, and summarize the argument of three of the citations.

If they fail this (and it should be obvious), they get an F for the paper and the class.

(I'd hope that picking just one (maybe two) of the students to undergo this grilling would both meet time constraints and be enough of a motivation for all the students not to baldly cheat.)

Expand full comment
founding

I assume “European style interview” means an oral exam or thesis defense type thing.

Expand full comment
author

Yup, this was my main conclusion.

Expand full comment
founding
Jul 18, 2023Liked by Maya Bodnick

First, terrific essay, Maya. Wow.

In-person proctored exams, potentially combined with PhD-type oral defense, can solve this looming problem in education. But that is gonna mean a VERY different university environment than the one we have now. More expensive and more exclusive, it won't be an education for the masses.

I'm less concerned about AI's effect on the "cerebral class" overall. Generating text is one part of those jobs (journalism mostly) but is not the major focus. Personal interactions, value judgements, crafting bespoke solutions and creatively helping your employer make money are all big parts of those careers and AI can't do those things. Yet.

Expand full comment
author

Thanks John! And I do agree that the human side of things is hard to replace; I think low-level writing work may be on the chopping block, though.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

The problem with exams more generally is that they help bullshitters and people who are good at blagging on the day. People who take their time, are more considered and methodical, these are the people who benefit from coursework. Of course blaggers don't really need an extra leg up because they always outperform in interviews later anyway.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

Harvard has in-person proctored exams now (and did in 2019 before GPT). It won’t force any important changes to cost or access

Expand full comment
founding
Jul 18, 2023Liked by Maya Bodnick

There are more colleges & universities than just Harvard.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

As did (giving myself an out in the case of COVID changes) practically every R1. Is your prediction that simply administering exams will become a central driver of class size reductions?

Expand full comment

Has in person examination fallen off a cliff in the last decade at downscale universities? I went to a very large community college and then Ohio State and all my exams in 2004-2009 were predominantly either practical or in class combinations of multiple choice and hand written blue books.

Expand full comment
founding

How many humanities classes did you take? Most humanities classes where you want to teach people to write have used at-home papers for ages, because you can’t see how someone writes in just a couple hours under pressure.

Expand full comment

You can use time-limited essays--I took a couple as a philosophy undergrad, and a ton as a law student (it's the most traditional law exam format). I think they measure horsepower and preparation pretty well, at least if you make them hard. But they don't really measure writing skill; good writing takes time, especially while you're learning. And for undergrads, a lot of the benefit of a humanities course is the writing versus the content.

Expand full comment

I mostly took journalism and political science classes after my gen ed years and most of those had both a paper and in person exams and were weighted toward the test.

In my journalism classes it was more the writing of publishing a newspaper than research papers and stuff.

Expand full comment

So, Maya, be honest - did you write this piece, or did ChatGPT?

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

It's well-written, but it's also well-argued and well-premised--that's how I know a human did it.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

Also relatively concise, something that ChatGPT doesn’t seem to know how to do well (though I haven’t experimented with prompting it to do so)

Expand full comment
author

Thanks guys!

Expand full comment
author

Maybe one of those mediocre AI detectors could tell you...

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

My initial guess is that it will indeed lead to law school style exams across the humanities…but the future has a habit of throwing slurves at us.

(The reality of the internet is that very smart students could always get away with plagiarism by simply rewriting…it still saves time on writer’s block issues.)

As for law, I’m senior enough that my income depends upon making the hard decisions. I’m a bit sanguine about what AI really means for brief and contract or commercial instrument writing — no one writes those from scratch anyway (the real reason why WordPerfect 5.1 was the favorite legal word processor is no one since then developing a word processor has understood how lawyers work.)

Expand full comment

I'm curious what other sorts of exams people have experienced in the humanities. Prior to law school I majored in history at a state flagship. All of our exams were written (by hand of course) in a big lecture hall under the eye of a proctor. The same general approach was followed in the English literature classes I took, and of course law school had a similar type of testing. The difference, if any, would be in the research papers (IIRC most history classes had a proctored mid term, a take home research paper/assignment of some kind, and a final, proctored exam).

Expand full comment

I had similar experiences and not just in humanities. All my intro math and science classes had in-class exams done by hand as were my macro/micro and many others. In fact, besides a pass/fail freshman seminar and a required physical education component, I can’t think of a single course where the *exams* weren’t done in class by hand or in a larger proctored session in an auditorium. Now, lots of other assignments were done at home, especially essays, and the weighting of the exams depended on the course or department.

Expand full comment
Comment removed
Expand full comment

That's really sad to hear and is a serious disservice to the students.

Expand full comment

What I’m getting at with legal drafting is that it’s already a combination of revising, editing and new content with each document. AI generating some of the content doesn’t change that.

Expand full comment

For younger grades, especially grades 4-8, I'm worried. There is absolutely nothing stopping them from doing all their take home written work via AI. Those kids are already more tech savvy than their parents, and can find a way to access anything that's available. The time, resources, and agility required for elementary and middle schools to prevent this themselves seems like it's not there and parents broadly do not have the ability either (some will, but as a whole, nope). And these are crucial ages for writing skills that are life skills, not academic ones. I fear we'll have many students that come into that age range proficient in reading and writing and then slowly drift downward over time before it's obvious they've been using AI and need remedial help.

I think the question about higher Ed is much less important than younger grades. For higher Ed, so much depends on whether AI can break into the realm of consistently great analysis rather than the mere good writing it produces today (both possibilities seem plausible to me). If it can do that, we're talking about a major breakthrough and the ability to measure college students' grades is small potatoes in comparison. If it can't break through, colleges will tweak assessment methods or raise standards and it'll be fine.

Expand full comment
author

Great point

Expand full comment

Another path is they become a tool just like spellcheck. I can't spell but since I always type it's not a huge set-back. I could also see these tools becoming the norm for research, starting structure, proof-reading, grammar puncher-uppers. They would level-up the entire class.

Expand full comment

If writing skill is affected by AI in the same way that spellcheck has affected spelling, I think we will be worse off. Writing is much more generalizable than spelling and is a conduit for novel ideas and analysis in a way that spelling is not.

Expand full comment
Jul 18, 2023·edited Jul 18, 2023

I don't see how that would be the case. Professional writers have editors. Editors create value. An LLM-powered "personal editor" would provide some approximate value.

Expand full comment

I think we're talking about different things here. For adults, I see powerful and useful writing AI as a clear good. It's the impact on the formation of writing skills in children that I think would be bad.

Expand full comment

Could be different things. I guess I just don't see a problem if children are learning how to write in tandem with such an editor-like tool. I think it would have a rising-tide-lifts-all-boats effect. I do agree if children are just copying and pasting LLM outputs that's a problem. I think teachers will be able to solve for that though.

Expand full comment

yeah but the homework in grade 4-8 is pretty BS, isn't it? In our district regular homework doesn't really even start until 7th grade.

Expand full comment

Yes. Excellent points.

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

At Oxford you’d eventually have to sit down with your don and they’d tear your paper apart in front of you while expecting you to defend every choice - both content and style.

Hard to scale but maybe a path forward since humanities enrollment is declining anyway.

Expand full comment

MIT does this with coding assignments - you have to explain your logic and choices to a TA.

Expand full comment

That's new, when did that start?

Expand full comment

The amazing thing is that this obviously far superior education has produced no apparent advantages for Oxbridge graduates over Ivy Leaguers.

Expand full comment

Hard to say. In economic terms I imagine you’re right but that’s presumably a byproduct of us vs uk. But if you consider the value of education to be purely about earning potential I think you’re missing out on a lot.

Expand full comment

Sure, but is there any metric at all where you can actually see the impact? “Great poems written” very much applies!

Expand full comment
Jul 18, 2023·edited Jul 18, 2023

With regards to advanced degrees - what I’m familiar with- the difference is tangible. I assume it can be seen in metrics such as years to PhD completion and success at job market (esp if you control for country in which you pursued consequent graduate degree). Other than that the honest answer is “I don’t know”, BUT I’d like to believe that having gone through some years when you were consistently challenged to think and work harder must make some long term difference at least to a fair number of people, if only with respect to their “inner”/mental life and probably for much else besides. But maybe not?

Expand full comment

I’d be very interested to see what’s tangible about that difference, since you’re familiar with it. I don’t see obvious signs of Oxbridge-educated scholars contributing more to arts and letters, let alone to science, than peers across the pond. It would be amazingly disappointing if all this effort largely went to just improving dinner party conversation and one’s internal intellectual pleasures.

Expand full comment
Jul 18, 2023·edited Jul 18, 2023

The tangible thing is, quite simply, that their undergraduate education is at a higher level so that they are much better prepared for grad school, in most respects, compared to their American peers. I speak of humanities but wouldn't be surprised if it's the same in the sciences. The upshot is being quicker to get a PhD. Since American grad schools are typically longer and more thorough in teaching they more or less close that gap by the point of completion of the PhD (vs the Oxford DPhil), but you "pay" for that in years spent getting there (then again American PhD programs are MUCH more generous in funding, so it again balances out or arguably favors the American side from the student's perspective, but not with respect to economic efficiency of the system as a whole). However the very existence of the gap at the undergraduate level is suggestive with respect to the majority who don't go to grad school.

Expand full comment

P.S. and of course the difference is measurable and apparent for the admittedly small minority who go on to pursue PhDs in their fields. Oxbridge people would typically have at least 2 years worth of time saved if that’s their goal.

Expand full comment

100% false in the T parts of STEM (as is partially observable looking at the composition of grad school classes)

Expand full comment

Yeah it depends more on where you go to grad school, if they end up back in the states they’ll still need to do the standard 4-6 years. PhDs here are measured by publications not by witty repartee or exams. On the continent they favor a shorter phd but people that want faculty positions typically need to do a postdoc or two after.

Expand full comment
Comment removed
Expand full comment
Jul 18, 2023·edited Jul 18, 2023

This is…not a great year to be talking about how well the British government functions compared to ours. One of their major parties was in fact even hijacked by a fat dumbass—but it didn’t even lose power after that, not to mention after failing to instigate essentially any recovery from the Great Recession!

Expand full comment
Jul 18, 2023·edited Jul 18, 2023

I’d also add that at Oxford your entire grade will be determined by the harrowing in-class by-hand (and still in-gown??) written exams at the very end of the degree.

Oxford was already miles ahead of top us schools in its academic rigor. I expect the gap to increase.

Expand full comment

Sub fusc is still required for the exams, which is worn with a commoners' gown or a scholars' gown. At the degree ceremony, you replace that gown with the gown and hood of your new status at the point where you are graduated to the new status.

Most UK universities have gone from 100% of the final mark being from Finals, to either 2:1:1 or 2:1:0 ratios (ie the final year counts as 2, the second year as 1 and the first year as either 1 or 0), and usually mark some fraction of the grade for each course through in-year assessment rather than pure examination, typically a third with two-thirds still being driven by exams at the end (and many have semesterised, meaning two exam sessions per academic year, one at the end of each semester).

Even in the lowest case, though, the final exams are likely to be at least a sixth of your degree, and if you include the first-semester of your final year, at least a third.

Expand full comment

Cambridge, as usual, has a completely different set of weird traditions different from Oxford. Cambridge graduates get two honours classifications, one for Part I and the other for Part II. Depending on the Tripos (subject), Part I may be just the first year, or may be the first two years; Part II may be any of years 2-3, 3, or 3-4. For bonus chaos, Natural Sciences has Part I (years 1-2), Part II (year 3) and an optional Part III (year 4); two-Part degrees are BA, three-Part degrees are MSci.

The upshot is that a Cambridge student in Natural Sciences can get a "triple first" - getting first-class honours in all three Parts of the Tripos, while students in most other subjects can only get a "double first". There are a few other Triposes with three Parts, giving a four-year degree with a master's.

All Parts of a Tripos at Cambridge are examined at the end of that Part; some two-year Parts are examined as IA and IB or IIA and IIB, with examinations at the end of each year, others are only examined at the end of the second year of the Part.

Expand full comment

As of 2007 they were still taken with the commoners gowns and scholars gowns. Today, I dunno.

Expand full comment

As of much more recently than that. Probably to this day.

Expand full comment
founding

When I was visiting Oxford in June students were headed to their exams in some sort of gown.

Expand full comment

As an engineering major a few years out from college, I think the humanities can learn a bit from STEM courses. Homework was important as a learning tool, but not for grades.

Profs would often give the answers for the problem sets ahead of time, but students had to come up with the work in between - I thought this was useful to de-emphasize the answers relative to the work for both students and instructors.

Instead, most of my upper-level grades came from intensive exams and extended group projects. I don't think the exams are necessarily relevant to work in industry, but they're a great way to assess what students know without much risk for BS. Proctoring was fairly light-touch owing to an honor code with good buy-in across the school.

Expand full comment
founding

yeah yeah, but does it get docked on the "personality" test when trying to get in? 👀

Expand full comment
Jul 18, 2023Liked by Maya Bodnick

Great first piece, Maya! A couple of thoughts from a grad student in the philosophy department:

As many have noted in the comments already, I think this is at least as indicative of the rampant grade inflation at Harvard as of chatGPT's abilities. But I wanted to explain a little more of *why* I think I've capitulated to the inflation and how it pushes back against your final thesis, that chatGPT's ability to do well on undergrad coursework indicates it will do well at the jobs that coursework is supposed to prepare you for.

So first off, I think it's a big deal that these were freshman courses. I've TFed two big intro philosophy classes, and I keep the knowledge in the back of my mind that most of these students are not going to major in philosophy. The biggest things I want them to take away from their essays are:

1) being able to write legible, specific, concise prose (chatGPT is full of the flowery imprecise language that flies in other humanities classrooms, but not in philosophy. I am constantly training my students out of bad habits here.)

2) having an interlocutor in mind and actually thinking carefully and generously about what it would take to convince them of a thesis they come into the paper disagreeing with. (As far as I've seen, chatGPT doesn't do very well on this more substantive front, and I kind of doubt it will get much better, because *most* writing you'll find in the world doesn't really do well on this front, even among philosophers.)

That's not to say chatGPT wouldn't do well on my freshman essays; it writes a *lot* like my freshmen at the beginning of the course. It would be suspicious to me if the writing contained the same flaws with no improvement towards the end of the semester, but I'd probably assume it was a lazy student who didn't bother to read my feedback, and give them a B+ at worst.

But here's the thing: I *know* most of my students aren't going to be philosophers, particularly the ones who don't bother to respond to my feedback. And it's just not worth tanking my teaching evaluations to give mediocre grades to mediocre students in a freshman course. On the other hand, if my juniors were turning in chatGPT-level work, I'd sit them down and have a serious chat, because presumably by that point they are seriously considering a career in philosophy, and that standard simply won't cut it.

So the tl;dr here, I think, is that what flies for a freshman essay is at least in my discipline not indicative of the ability to succeed in the major, *much less* in the field professionally. Maybe this is less true of fields where you have job options besides academia, but I think that means the humanities at least are relatively safe.

Expand full comment
author

Thanks for your thoughtful comment. Super interesting!

Expand full comment
Jul 18, 2023·edited Jul 18, 2023Liked by Maya Bodnick

>> chatGPT is full of the flowery imprecise language that flies in other humanities classrooms, but not in philosophy. I am constantly training my students out of bad habits here.)

I’m in history/classics and feel you with every word. So what are those “other classrooms” and are they even in the Humanities? Do they really exist that often? And should we even assume that they do, given that we’re talking mostly about freshmen? I think students come with bad habits from HS (or rather in HS they mastered more basic skills) and in college we try to bring them to the next level as writers and thinkers. Learning that clear and precise language is much better than flowery language is a classic example.

Expand full comment
Jul 18, 2023·edited Jul 18, 2023Liked by Maya Bodnick

Fair point! I think philosophers like to think of ourselves as the last rigor remaining in the humanities and have grossly inflated notions of what our field demands relative to others. I haven't actually read professional work in history so it's unfair of me to paint with such a broad brush. That said, I have read a lot of work in urban studies and sociology, and I do think the use of language that a philosopher would call "imprecise" is pretty widespread. But it might be that the kind of clarity sought in philosophy just isn't necessary for the goals of other fields. We should probably get off our high horses about this.

Expand full comment

It’s a bit of both. However I don’t think either sociology or urban studies (about which I know little and close to nothing respectively) are typically considered “humanities”? When I hear humanities I think of: history, philosophy, classics, comp lit, linguistics, no?

Expand full comment
founding

I would think English, history, philosophy, languages (both modern and ancient) as well as most “area studies”. I would think of linguistics, anthropology, and sociology as social sciences, along with economics and psychology. I’m not sure whether urban studies would go with the humanities or social sciences. It might be like cultural anthropology in many ways, which straddles the line.

Expand full comment
Jul 18, 2023·edited Jul 18, 2023

As another Ivy Ph.D. Student in the humanities (Columbia, in my case), I'd say that the source of a huge amount of this imprecision is ironically philosophy itself, albeit of a different sort rarely taught in actual philosophy departments (as far as I know!): the continental tradition. I'm thinking especially of the thought of various po-mo authors/philosophers that have become hegemonically influential in the humanities (e.g., Foucault, Derrida, Barthes, etc.). I don't think this stuff really has much presence at the undergrad level, but if we're talking about published work, it's everywhere, especially in Sociology (not a humanistic discipline IMO, but still influential in the humanities through figures like Bruno Latour, another po-mo-ist).

Expand full comment
RemovedJul 18, 2023·edited Jul 18, 2023
Comment removed
Expand full comment

"I think part of the problem is that quite a few subjects do not clearly define where they intend to place themselves among these possibilities [for emphasis on data]."

Many (most?) academic fields define themselves by the problems studied, not the methods used to attack them.

Expand full comment
Comment removed
Expand full comment
Jul 19, 2023·edited Jul 19, 2023

No, you've missed my point. I agree: for each problem, there is a maximally effective method to tackle it. Because a discipline rewards practitioners for solving problems, those effective methods will become characteristic of the discipline. But that does not mean that the discipline *defines itself* by those methods. If a new and better method is invented, the discipline changes: it becomes the people who study the *old problem* with the *new method*, not the people who study a *different problem* for which the *old method* is still effective.

Consequently disciplines do not feel the need to "police" their methodologies. If it produces results, then fantastic: they now have a new eye with which to triangulate. And if not, then that's one fewer competitor for tenure.

This is true for your very examples. Historically, chemistry has involved tremendous work in purification (actually, one might say that purification and separation is *still* the heart of the discipline) and very little atomic physics. As modern spectroscopy has become relevant for chemical elucidation, chemists have started using spectroscopy. But we still call them chemists! Likewise, modern chemical syntheses eschew traditional reagent combinations, instead using enzymes isolated from cell lysates. But we don't call their engineers biologists.

Conversely, physicists no longer use water clocks nor pendula nor even oscillating circuits to measure time; instead, a metrologist will use spectroscopic transition frequencies, the selfsame technique I mentioned a paragraph ago. But that doesn't make them a chemist, because they are using it to a *different end* (viz., measuring time).

The same is true for the mathematization of economics: Adam Smith's literary works are very different from Paul Samuelson's mathematical models, but they're part of the same discipline because they both study welfare and transactions. Likewise, Smith, Marx, and James all performed similar work, but one is an economist, another a sociologist, and the third a psychologist. (Similarly: was Pavlov a psychologist or a biologist? Why?)

You would be on stronger ground if you pointed to the rise of "interdisciplinary fields" like biochemistry (biology with chemical techniques), geophysics (geology with physical techniques), or neuroscience (psychology with mathematical/physical techniques). But here, too, I think one could make the argument that these disciplines "speciated" because they ended up stuck on problems quite different from their parent discipline(s), not because they ended up using different techniques.

The one possible exception I can think of is the extent to which modern economics encroaches on sociology: both study trustworthiness, organization, and learning, but a sociologist generally eschews mathematical models and an economist prefers them. But the rise of "itinerant economists" is a relatively new phenomenon, and I think it's probably too early to characterize its disciplinary effects.

Expand full comment

Agree with this - also Harvard admits plenty of people with mediocre HS English instruction who will do great things in not-humanities. Turning required breadth requirements into a B- filled slog isn’t really in anyone’s interest.

Expand full comment

This reminds me of the hardest course I took in college. Maybe not in an absolute sense, but at least in the sense that I was totally unprepared and had to actually overhaul my study habits to pass.

It was French 202. Shouldn't have been anything particularly difficult. The problem was that my small liberal arts college only had one French professor, and the one course I needed was only taught in the spring, and was at the same time as something I needed for my major. So I ended up needing to commute to a nearby larger school to get my credit.

The problem of course is that the school I went to did not require ANY foreign language credits to graduate. So rather than taking a class with some other kids who wanted to learn a bit about French culture and maybe passably order food in a restaurant while on vacation, I was in a class with 20 students who intended to become French teachers or professional translators. After getting like a 20% on the first test or something, I went to the professor and explained my situation. To his credit, he said he wasn't going to give me anything for free but was willing to work with me and answer questions after class, and I think I managed a B- by the end.

This is long-winded but my point is that I agree that there's just an enormous difference between a class that's intended to expose students to something new and ask them to think about it and maybe develop a few baseline skills, and something intended for future professionals.

Expand full comment
founding

"And it's just not worth tanking my teaching evaluations to give mediocre grades to mediocre students in a freshman course."

I think you are doing your students a disservice by withholding honest feedback, merely because some of them will tank your evaluations. It is a bad incentive by your employer, to be sure. But it seems pretty selfish to me.

Expand full comment

I didn't explain the ins and outs of my grading system in the above comment, but I take this very seriously! I write extremely extensive sets of comments on all of my students' papers and give them a clear sense of where I think their writing could be stronger. I've also recently been trying out a system where I give two grades: a "real" grade reflective of some objective standard I think they ought to be able to reach by this point in their education, as well as a "Harvard" grade, adjusted for inflation. So far students seem to be receptive!

(Funny enough, I recently found out Harvey Mansfield, one of Harvard's last remaining conservative professors that just retired, has been using the same system for decades.)

Expand full comment
founding

Kudos for trying to walk the line between the grade inflation the administration appears to want and the honest feedback the students deserve.

Expand full comment

So this is an interesting article and it’s on a theme I’ve heard a lot about (academic cheating) but it’s totally perpendicular to my personal experience.

I live in Australia and recently graduated from a law degree (it’s undergraduate here). I tried out Chat GPT to see what it could do and it was almost entirely useless. There is no way it would’ve been able to get even a passing grade on any legal essay or take home exam I’ve done. It seems the need for highly specific citations for all claims, synthesis of highly complex ideas and the diversity of sources of law were well beyond what Chat GPT could do.

That being said, it’s definitely improved over the last 18 months and maybe some day it’ll be able to do it - but it is well well off as of now.

Which is why I am very skeptical AI is going to be replacing very many lawyers in the near future (except maybe improving doc review automation?)

It seems like the key difference here is that most of these essays/take-homes seem very high schooly to me. Again might be due to difference in how University works in different countries - but I’ve never received course work that was this easy - even in 1st year. Is this typical?

Jul 18, 2023·edited Jul 18, 2023

Anyone who follows my comments here knows I’m highly critical of undergraduate academic rigor at Harvard (and its sisters). However, I admit I too was kind of surprised by most (not all) of these essay prompts, which indeed seemed “high schoolish” and unfitting. Also way too generic.

Jul 18, 2023·Liked by Maya Bodnick

Yeah coming from the Phil department I now understand why my undergrads are so baffled by our prompts!

Jul 18, 2023·edited Jul 18, 2023

If you make sure to show them you’re listening (listening, but not bending over backwards), they will appreciate it, and if you work hard at planning interesting and challenging classes, they will appreciate that too and will likely give you good evaluations. You should be demanding but absolutely clear on how they can attain that A. It can be a high bar, but it needs to be a clear one, and many of them will do the necessary work. Remember that at least 80% of them (probably more now, post-Covid) have never encountered actually demanding, rigorous academic work. It’s your job to remedy that, and they will appreciate it after the initial shock.

Jul 18, 2023·edited Jul 18, 2023·Liked by Maya Bodnick

"I tried out Chat GPT to see what it could do and it was almost entirely useless. There is no way it would’ve been able to get even a passing grade on any legal essay or take home exam I’ve done."

I'm an American patent attorney, and I have had similar experiences trying to get ChatGPT to do patent work. It doesn't understand very core concepts of patent law, and it lies constantly.

That said, I suspect it would get better at any area of law with some specific training and access to various legal databases.

EDIT: To clarify, this was with a paid account and GPT-4.


Lexis and West are all over this I’m sure. I think we’ll have a good working model within 18 months.


West is in the process of buying Casetext, which develops AI legal research tools. I use Casetext a lot and think it's excellent, but it's built for lawyers to use, not as a replacement for a lawyer. It would be no more useful to a layperson than Westlaw. And while Westlaw is far more powerful than old hard-copy methods of legal research, it requires a lot of underlying professional expertise to understand how to use it and what it's spitting out.

Jul 18, 2023·Liked by Maya Bodnick

I’ve noticed something similar while experimenting with the tool at my workplace— it can generate work-useful summaries of text, but quite a bit of prompt engineering is necessary to get those summaries focused/relevant, and it really can’t move from summary to making good inferences or judgments.

Jul 18, 2023·Liked by Maya Bodnick

There’s an enormous difference between the free version (3.5) and ChatGPT 4, for what it’s worth.


At the small private liberal arts school I attended about a decade ago, the professors never assigned any written work that didn’t require citations. One history professor required us to write an introduction and thesis every week for a possible 3 page paper, and then printed off the best one and distributed it for feedback and criticism in front of the class, and that was only in a 200-level class. Others required us to attend office hours with our outlines of papers and talk through our arguments before writing. Harvard just seems lazy, not just easy? But of course Harvard professors are busy writing New Yorker articles and Substacks and can’t be bothered with students...

author

You need citations in Harvard classes, I just didn't worry about this for the experiment -- I'm sure you could easily add them manually to a GPT-written essay and eventually the AI will prob be able to do it


It’s been a while but these seemed like pretty “friendly” topics to me. That is to say, weak and unrealistic.

Jul 18, 2023·edited Jul 18, 2023

Yeah, I think this is a much better proof of the thesis that Harvard is assigning a lot of bullshit work than that most low-level legal work is imminently going to be automated away. It would have helped a lot to see the full essay prompts and responses; Bryan Caplan had it get an A on an exam for one of his courses but the exam itself was just a bunch of requests to verbally reiterate Bryan Caplan’s opinions on things. I still haven’t seen it doing a convincing job on serious work.

Anyway, being allowed to skip citations is a lot like saying it did well on a math exam where it had to state the right theorems to use on each problem but not actually apply them to the given quantities.

Jul 18, 2023·Liked by Maya Bodnick

My thought on writing papers is that, besides moving to in-class essays, professors in non-creative writing style classes could place a lot more emphasis on including footnotes/endnotes done in legal writing style -- i.e., every statement of fact needs to be "pin cited" to an exact page in the source, preferably with a quotation in the text or a parenthetical quotation after the cite -- and then, instead of focusing mostly on the body of the text, the professor primarily focuses on checking a random sample of the citations to make sure they actually say what they purport to say. This isn't very glamorous, but, given that ChatGPT 4 reportedly still has issues with "hallucinating" sources, it seems like it's still the best non-automated way to see if work product is human generated.


While I think this might still be a good idea regardless of AI - this is a good example of how hard it is to design an AI-resilient system while the tech is still developing.

A couple months ago it was true that GPT4 was really, really bad at appropriately citing real sources. I wouldn't say it's great now, but it's much, much better now that plugins have been introduced. The ScholarAI plugin gives it access to a bunch of academic databases and can save everything to 3rd party citation managers. It's still nowhere close to perfect. I wouldn't trust it for real research. But for something like a freshman sociology paper, its citation abilities have gone from "this will probably get you caught and kicked out" to "double check before you turn anything in, but it's probably good".


So the plugin makes another pass on the internet, finds all the quotations (or something close) and then cites them? Since the GPTs apparently don't retain their training sources, what could possibly go wrong ...

Jul 18, 2023·Liked by Maya Bodnick

As a former (science) TA, I started thinking about grading humanities essays and shuddered. ChatGPT gets my rubber stamp B+.

Still spouts complete nonsense when you ask it a science question though.


I don’t think this is entirely true. It says intelligent / useful things about gene pathways, for example.


A bit of hyperbole on my part. What I've noticed is that it says a lot of things that are *partially* true and uses enough jargon to sound like someone who knows what they're talking about. But anyone with some expertise in the subject would recognize the holes. If it were a human, you'd call it out as bullshitting. :)
