AI News - Precision Talent

admin
Jun, Mon, 2026

AI News

Judge Rules Blacked.com Can Sue Meta for Scraping Its Porn

A federal judge has rejected Meta’s attempt to dismiss a lawsuit from Strike 3 Holdings, the company that owns popular sites like Blacked, Vixen, and Tushy, for scraping its porn videos.

The decision shows Meta’s nonsensical justification for scraping massive amounts of copyrighted material from the internet in order to train its AI models, and is notable for adult content creators, who have been scraped for model training data long before the current generative AI boom.

Strike 3 Holding first filed its lawsuit almost a year ago after internal Meta emails revealed in a different lawsuit showed that the company downloaded over 81 terabytes of data by scraping Anna’s Archive, a massive open search search engine for torrenting copyrighted material including books, movies, TV shows, and porn. A Strike 3 Holding investigation found that 47 IP addresses belonging to Meta were used to torrent 2,396 of its videos a total of 6,008 times between 2018 and 2025. On Thursday, Judge of the United States District Court for the Northern District of California Judge Eumi K. Lee rejected Meta’s attempt to dismiss the lawsuit, allowing it to move forward.

Meta argued that Strike 3 Holdings failed to show that Meta actually intended to use Strike 3 Holdings’ videos to train its AI models and that Meta, the company, was actually responsible for downloading the videos, as opposed to rogue employees downloading porn on company time from company IP addresses.

According to the judge’s ruling, Strike 3 Holdings’ investigation showed coordination across Meta’s IP addresses that proved “a coordinated effort to gather data,” as opposed to the action of random employees. Specifically, Strike 3 Holdings showed that Meta’s IP addresses torrented files with similar file names on the same day, ranging from porn to cartoons and sitcoms, suggesting the company was downloading files based on key terms.

“For example, IP Ranges A and F torrented the following files on December 15, 2022: ‘Teen Sex Sessions 2 (2012),’ ‘Teen Titans Go to the Movies (2018),’ ‘Teens Love Tats XXX,’ ‘TeensLoveAnal.16.09.30.Amara,’ ‘Teenfidelity Pics,’ ‘TeensLoveAnal.16.06.10.Casey,’ ‘Teenage Mutant Ninja Turtles (1987-1996),’ ‘Teen Mom Girls Night In S02E08,’ ‘TeenyTaboo.22.12.07.Kiana,’ and ‘TeenageDelinquents.Maryjane,’” the decision says. “On the same day, a Corporate IP Address was used to torrent ‘TeenCurves.22.12.09.Willow.’ The connection between these files is plain: The word ‘teen’ appears in every file name.”

The judge said that Meta suggesting that its IP addresses downloading all these files at the same time was the work of different individual Meta employees acting independently “strains credulity.”

The judge also explained that whether Meta actually used Strike 3 Holdings’ videos to train its AI models is irrelevant because Meta violated Strike 3 Holdings’s copyright when it torrented its videos. It illegally downloaded the files and also “seeded” them, meaning they distributed the pirated to other users.

“In sum, Plaintiffs [Strike 3 Holdings] have plausibly alleged that Defendant [Meta] is liable for direct, vicarious, and contributory copyright infringement based on the torrenting of their films,” the decision said. “Defendant’s motion to dismiss is therefore DENIED.”

admin
Jun, Mon, 2026

AI News

Introducing the OpenAI Partner Network

OpenAI launches the Partner Network, investing $150M to help global partners accelerate enterprise AI adoption, deployment, and transformation.

admin
Jun, Sun, 2026

AI News

Scientists Discover Vast Ancient ‘Necropolis’ Teeming With Strange New Creatures

Welcome back to the Abstract! Here are the studies this week that died in the deep, let nature call, tossed a galactic salad, and became interstellar voyeurs.

First, there’s a whale necropolis under the sea that is packed with ancient carcasses and teeming with new species. Then: a bygone world preserved in poop, the fruits of the universe’s labor, and a zoom lens for distant planets.

As always, for more of my work, check out my book First Contact: The Story of Our Obsession with Aliens, or subscribe to my personal newsletter the BeX Files.

A visit to the cetacean cemetery

Peng, Xiaotong; Zhou, Peng; Song, Xikun; Bianucci, Giovanni; Du. Mengran et al. “A 5.3-million-year-old deep-sea whale necropolis in the Diamantina Zone.” Nature.

Scientists have discovered an unprecedented underwater “necropolis” that contains the remains of hundreds of whales that died over the past five million years, scattered across 745 miles.

During dives in a deep sea submersible, researchers spotted whale bones submerged under more than four miles of the Diamantina Zone in the Indian Ocean, making this site the geographically largest, deepest, and oldest whale necropolis ever found. The graveyard is also teeming with species that may be “new to science” and subsist on these fortuitous “whale falls,” according to a new study.

“The discovery of whale-fall communities in the Diamantina Zone at depths exceeding 6,700 meters establishes one of the deepest known whale-fall ecosystems in the ocean, extending the known depth range of such habitats by more than 2,500 meters,” said researchers co-led by Xiaotong Peng of China’s Institute of Deep-sea Science and Engineering.

“This area has a deep and extensive accumulation comprising five modern natural whale-fall communities and 476 fossil cetaceans recorded,” the team said.

Peng and his colleagues first spotted the necropolis during dives in early 2023 using the Fendouzhe submersible, which is capable of bringing crews to depths of nearly seven miles. The team quickly realized they had tapped into a scientific motherlode, complete with an immense fossil archive of extinct animals—mostly deep-diving beaked whales—along with recent whale falls that still support thriving ecosystems of crustaceans, molluscs, worms, and microbes.

“Bone-eating worms, gastropods, vesicomyid bivalves and brittle stars dominate the megafauna (more than several centimetres in size), reaching local densities up to 2,840 individuals per square metre,” the team said. “Most recovered taxa may be new to science.”

As for why this vast necropolis formed, beaked whales may be attracted to these deep waters due to the abundance of prey sources, such as squid and fish. Some might accidentally dive so deep that they experience decompression sickness or fatal exhaustion, becoming bonus bodies for seafloor ecosystems. The sinking carcasses are then funnelled into the Diamantina Zone because of its V-shaped topography, serving up a figurative feast for scientists (and a literal one for marine biota).

“As beaked whales are known primarily from rare strandings, their abundance, distribution and ecology remain poorly understood overall,” Peng and his colleagues concluded. “Our discovery of an accumulation of skeletal remains…provides an unparalleled source of information on these largely enigmatic cetaceans.”

Mariners have long dreaded ending up in Davy Jones’ locker, the proverbial resting ground of drowned sailors. It turns out that whales have a whole locker room down in the deep, where the bodies of countless leviathans blossom into fleeting hotspots of life.

In other news…

Bathroom blast from the past

Murchie, Tyler J. et al. “Ground squirrel coprolites preserve complex archives of ancient environmental DNA over 700,000 years.” Nature Communications.

The Klondike region of Canada’s Yukon territory is famous for the 19th-century gold rush that led hopeful prospectors to riches, ruin, and early graves. But now, scientists have found a very different type of valuable nugget in Klondike soil—ancient squirrel poops made by ancient squirrel bums as early as 700,000 years ago.

Scientists sequenced ancient environmental DNA (aeDNA) from these permafrosted scats, thereby opening up a poopy portal into the past. The fossilized feces, known as coprolites, contained genetic traces of mammoth, saber-tooth cat, horse, and bison, suggesting that these Ice Age rodents may have gnawed on the corpses of much larger megafauna. The coprolites also preserved DNA from hundreds of plant species, several insects, and a bevy of microbial and fungal strains.

“The diversity and abundance of aeDNA recovered from the permafrost preserved, ground squirrel coprolites presented here underscores the immense value of Arctic rodent middens as repositories of Quaternary ecosystems,” said researchers led by Tyler J. Murchie of the Hakai Institute and McMaster University. “The ecological and evolutionary power of coprolites would appear to exceed that of both bone and sediment.”

As a bonus, the team refers to the rodent behind each coprolite as the “defecator,” in case anyone is seeking inspiration for a disgusting superhero concept.

Eat your galactic green peas

Gupta, Maitrayee et al. “Blueberry and Green Pea galaxies live in low-density environments.” Astronomy & Astrophysics.

The fruits of summer gardens are beginning to ripen here on Earth, but what about the pea patches and berry bushes of outer space? In a new study, astronomers examine a sampling of so-called “Green Pea” and “Blueberry” galaxies, which are small and compact systems that have extremely high star formation (“starburst”) rates.

Named for their green and blue hues, these starry objects are thought by some scientists to be similar to the first galaxies that lit up the universe during the epoch of reionization more than 13 billion years ago, making them useful analogues of primordial galactic evolution.

“Within the diverse tapestry of galaxy populations, Green Pea and Blueberry galaxies represent particularly intriguing classes,” said researchers led by Maitrayee Gupta of the Astronomical Institute of the Czech Academy of Sciences. The galaxies “present an opportunity to gain a unique perspective” on the processes “driving cosmic reionisation,” the team added.

To that end, Gupta and her colleagues observed a selection of these galaxies and found that they “predominantly reside in isolated, low-density environments” which means that their intense starbursts are not driven by interactions with galactic neighbors, such as mergers. Instead, the team concluded that these recent starbursts are driven by internal processes, “reinforcing their role as nearby analogues of young, low-mass galaxies in the early Universe.”

If you’d like a more substantive galactic meal than peas and blueberries, may I recommend the Fried Egg Galaxy or the Hamburger Galaxy? Cap it off with a Milky Way for dessert.

Journey to the magic spyglass in space

Palos, Mario F. et al. “Curved Space Telescope: E-sail concept to the solar gravitational lens focal region.” Advances in Space Research.

There is a sweet spot in the outer wilds of the solar system, about 650 times the distance between Earth and the Sun, where it is theoretically possible to peer across interstellar space and spot surface features of exoplanets—including continents, oceans, or perhaps signs of life.

This phenomenon, known as the solar gravitational lens, is caused by the Sun’s gravity warping light from distant sources, essentially making it a stellar magnifying glass. It could be an incredible observational tool, but schlepping all the way out into the solar sticks is a huge challenge that has inspired a host of futuristic spaceflight concepts.

Now, scientists have proposed sending “an E-sail propelled spacecraft” called the Curved Space Telescope (CST) powered by the solar wind, a stream of charged particles emitted by the Sun. The probe would cruise through the solar system by deploying metallic tethers that tap into the solar wind and generate thrust from repulsion effects with its particles.

“One of the most interesting scientific objectives for a mission like CST would be the search for proof of extraterrestrial life,” said researchers led by Mario F. Palos of the University of Tartu. The team added that risky maneuvers, like slingshotting close to the Sun, would not be necessary for this mission, unlike previous proposals along these lines.

E-sails have never been tested in space and it’s anyone’s guess whether we’ll ever be able to send a mission to this interesting frontier. Still, it’s amazing to think about capturing close-ups of aliens on faraway exoplanets through a starry lens.

Thanks for reading! See you next week.

admin
Jun, Fri, 2026

AI News

New OpenAI Academy courses for the next era of work

OpenAI introduces three Academy courses that help people build practical AI skills, create repeatable workflows, and apply agents in everyday work.

admin
Jun, Fri, 2026

AI News

‘You Will Not Speak on Flock Tonight’: County Commissioner Refuses to Let Residents Opposing Flock Speak at Meeting

A County Commissioner in North Carolina refused to let dozens of residents speak opposing Flock surveillance at a public meeting this week, instead forcing the group to designate one single spokesperson.

“How many people are here for public comment dealing with license plate readers AKA Flock?,” Michael Garrison, the chairman of the Madison County Board of Commissioners began the public meeting by saying. Nearly everyone in the audience’s hand went up. “Probably most everybody. Per our county policy, I’m going to respectfully ask that you guys take a few minutes to converse with each other, designate one person to speak … we’ll move forward with only one person, whoever that happens to be.”

“What? No. We all want to speak on this,” someone in the crowd said; others can be heard trying to object as well.

“You will not speak on Flock tonight,” he responds. “One person designated. You can pick that person … if I gave everyone three minutes to say the same thing, which is opposition to Flock, we’d never get done … I’ve spoken. I’m not debating this. I am taking advantage of our policy as it is written to streamline this process, you can either do it or not.”

“You’re in a room full of people who care!,” a person in the crowd says.

“We’re not going to engage in this back-and-forth conversation,” he responds. “We’re going to allow one person. Pick a person or not.”

0:00

/1:50

The Madison County Sheriff’s Office has been using Flock’s automated license plate readers, which scan and analyze the time and location of cars as they drive by, since at least March, according to a Facebook post by the Sheriff’s Office. Records compiled by HaveIBeenFlocked.com based on public records requests show that the Sheriff’s Office searches Flock hundreds of times per month. Over the last year, citizen privacy groups have successfully pressured their local governments into ending contracts with Flock. But in some cities and municipalities, residents feel like their concerns have been ignored.

“The Sheriff Office claims they are only using this technology for serious crimes, yet published audit logs tell a different story,” a website called Madison for Privacy says. “Madison County has searched the nationwide database over 1,200 times over just a 60 day period. In a county over only 20,000 residents, its hard to understand what could warrant this many searches.”

Members of the audience and several of the commissioners then argued back and forth. The commissioners said that the citizens constituted a “group” who all had the same position, and therefore could only select one representative to speak for seven minutes, which the board said was longer than the three minutes each person would normally be allowed to speak for. Residents argued that they were not a “group” but were there to give different perspectives on the issue and that they were concerned about the surveillance as specific individuals: “I’m not here as a group, I’m an individual,” one person says.

“I’m not here to argue with you,” a commissioner responds.

“So you’re going to decide to not listen to your citizens, that’s what you’re saying,” a woman in the crowd says.

“We’re going to follow the policy,” the commissioner responds.

“Can we request that there be a special meeting,” about Flock, a resident says.

“If you want a special meeting, you go back to the 250 years that the sheriff has been the elected official in the state of North Carolina and you have that meeting with him. This board, we don’t own Flock cameras, I’ve emailed some of you this. We don’t pay for Flock cameras. We don’t operate Flock cameras. We have no interest in Flock camera or Flock camera discussion. That’s your elected sheriff. So if you want to have a meeting with the person that’s involved with that, then you’ll have a meeting with [him], not with us that’s a legislative body. We don’t control the sheriff’s budget. We give him X number of dollars, he does with it what he wishes. I’m not having this discussion. Either you select a person or not.”

One of the residents suggests that the board of commissioners could pass an ordinance about Flock cameras; he is cut off by Garrison, who says again that the residents can pick a person to speak or not. Eventually, the residents do select one representative, who was allowed to speak for seven minutes.

Garrison’s argument is that the Board of Commissioners gives the Sheriff’s Office a budget, and that the Sheriff can spend the money on whatever it wants to. He suggested that the board therefore does not have oversight of what surveillance technology police are buying or what they are using it for. This fact highlights a problem many communities around the country are facing: Cities and counties are sometimes buying Flock surveillance technology without any transparency, with no public process, and with very little oversight. Citizens around the country have also felt like their elected officials are not listening to their concerns about surveillance.

It is common practice at city council and county council meetings to allow all residents who have shown up to speak provide public comment, which is one of the reasons that these types of meetings are often many hours long. At the Madison County meeting, these residents were not allowed to speak, which is much different than the practices we’ve seen at other, similar meetings.

Later in the meeting, another resident explains that their public records requests for details about the Sheriff’s Office contracts and use of Flock have not been sufficiently responded to. She was allowed to speak because she was providing comment about her requests for public records, and not Flock specifically. “I’m here to talk about the lack of government transparency and accountability that I’ve seen come up with the Flock issue, starting with tonight. I think that it’s disgraceful the way you are refusing to let citizens speak to their elected officials,” she said. “We’ve repeatedly asked you to hold a public meeting for us to discuss this, so I’m very disappointed to see a lack of transparency.”

The Madison County Board of Commissioners and Madison County Sheriff’s Office did not respond to a request for comment.

admin
Jun, Fri, 2026

AI News

Research into how AI can help users understand skin conditions

Health & Bioscience

admin
Jun, Fri, 2026

AI News

How Preply combines AI and human tutors to personalize learning

Preply uses OpenAI to launch AI-generated lesson summaries, providing personalised feedback and language learning exercises.

admin
Jun, Fri, 2026

AI News

The benchmark gap, explained: What AI leaderboards measure and what they miss

Somewhere out there, a model changelog is promising “significant reasoning improvements.” And somewhere else, an engineering team is staring at a production incident that the benchmark scores completely missed.

The benchmark gap, explained: What AI leaderboards measure and what they miss

These two things are related.

Every frontier model now scores above 88% on MMLU. GPT-5.3 Codex sits at 93%.

At that ceiling, score differences between models are statistical noise, and the benchmark that defined AI progress for years has become functionally useless for comparing top-tier systems.

Research published in late 2025 found a 37% gap between lab benchmark scores and real-world deployment performance for enterprise agentic AI systems.

Production had other ideas…

💡

This is benchmark theater: evaluation performed as spectacle, with the substance stripped out. If you have ever watched a model ace every eval you threw at it and then hallucinate its way through a production workflow on day one, you already know exactly what this article is about.

Pull up a chair and let’s begin…

How benchmarks became a leaderboard sport

The origin story

The original purpose of benchmarks like MMLU, GSM8K, and HumanEval was genuinely reasonable. Standardized tests let researchers compare models across institutions, track progress over time, and surface capability gaps.

Good stuff.

The problem arrived when benchmark scores became the primary currency for model marketing, at which point “measuring capability” became “winning the leaderboard.”

Where the incentives went wrong

Once scores started driving funding decisions, press coverage, and enterprise procurement, the incentive to optimize for the test rather than underlying capability became structurally inevitable.

Labs are staffed with brilliant researchers who understand exactly which training decisions move benchmark numbers. Some of that optimization reflects genuine improvement.

Some of it is, if we are being honest, just very well-compensated teaching to the test.

The contamination problem runs deeper than most teams realize

Data contamination is the most documented failure mode in benchmark evaluation, and also the most politely ignored one. LLMs are trained on web-scale corpora, and those corpora routinely include benchmark questions, answer keys, and worked solutions.

Claude responded

Empirical audits have found contamination levels ranging from 1% to 45% across popular QA benchmarks, with rates growing as benchmarks age. Turns out the internet is a terrible place to keep your test answers private.

Why mitigation strategies fall short

The standard fixes are less effective than assumed:

Paraphrasing questions provides minimal protection: research at ACL 2025 found LLMs often circumvent these transformations because they have already been trained on the obfuscated formats
Translation and context tweaks face the same problem: a model that has seen a paraphrased version of a GSM8K problem during pretraining is still a contaminated model. Just a more devious one
N-gram overlap and hash-based matching catch the obvious cases, but semantic similarity and cross-lingual leakage are substantially harder to detect at scale

💡

The deeper issue is that training corpora are so large that labs themselves have limited certainty about what is inside them. Nobody loves admitting that, but there it is.

What the numbers actually measure

Here is what benchmark saturation looks like in practice as of early 2026:

MMLU and MMLU-Pro: functionally saturated above 88% for frontier models, making score differences at the top statistically meaningless for procurement decisions
GSM8K: frontier models now reach 99% (GPT-5.3 Codex), rendering it useful only for evaluating smaller or fine-tuned models against base variants
MATH-500: at 96% for leading models, approaching the same ceiling that made MMLU uninformative
GPQA Diamond: sitting at 94.3% for frontier models despite being designed as a graduate-level science benchmark just two years ago.

The benchmark gap, explained: What AI leaderboards measure and what they miss

Enter humanity’s last exam

Humanity’s Last Exam (HLE), developed by the Center for AI Safety and Scale AI and published in Nature in January 2026, was specifically designed to resist this saturation.

Built from 2,500 questions sourced from nearly 1,000 subject-matter experts across 500 institutions, it filtered to problems that stumped GPT-4o and Claude 3.5 Sonnet at launch.

💡

The results are clarifying. The best frontier models currently score around 35% on HLE. Human domain experts average 90%.

That 55-point gap is a far more honest picture of where these models actually sit on genuinely hard reasoning tasks, and a useful corrective the next time a model changelog promises “significant reasoning improvements.”

The structural mismatch between benchmarks and production

Even a perfectly uncontaminated benchmark has a deeper problem: it measures a model in isolation on a fixed task, which is rarely how AI systems actually get used. A model evaluated on clean, well-formed prompts in a controlled environment is essentially a driver who only ever practiced in an empty parking lot.

Confident.

Fast.

Completely unprepared for the school run.

As MIT Technology Review has argued, AI systems are almost always deployed in ways that differ fundamentally from how they are benchmarked.

What production actually throws at your model

Production environments introduce variables that static benchmarks are structurally unable to capture:

Prompt injection attacks and adversarial inputs from real users (who are creative, bored, and occasionally out to cause chaos)
Latency constraints and SLA requirements that affect which responses are actually usable in practice
Cost variation: the CLEAR framework research found 50x cost variation across enterprise agentic systems achieving similar accuracy scores
Reliability degradation at volume: consistency dropping from 60% to 25% under production load conditions, per the same research
Compliance and policy requirements that standard benchmarks leave entirely unaddressed

💡

The 37% lab-to-production gap in agentic systems is a direct consequence of benchmarks optimizing for task completion accuracy while enterprises need holistic performance across all of the above.

A model that scores 91% on SWE-bench Verified may still stumble on the prompt injection, access control, and error recovery requirements of an actual production coding agent. The leaderboard has yet to add a column for “falls over when a user pastes something unexpected.”

The emerging evaluation stack

The research community has been building toward more defensible evaluation for several years.

The approaches gaining traction in 2026 share a common logic: make the benchmark harder to game by making it harder to predict.

Benchmarks designed to stay ahead:

LiveBench refreshes tasks on a rolling schedule, sourcing from recent publications and events that fall after model training cutoffs
LiveCodeBench continuously collects newly released programming problems, so score increases must reflect genuine improvement rather than memorization
SWE-bench Verified moved from isolated function generation to real GitHub issues requiring working patches validated by unit tests. As of March 2026, Claude Opus 4.5 leads at 80.9%.

The layered enterprise approach

For enterprise teams, the Kili Technology benchmark guide published in May 2026 recommends stacking evaluation in three layers: automated metrics for coverage, LLM-as-a-judge for screening, and human expert review for domain-specific correctness.

💡

The human expert layer is the part most teams skip in the interest of speed. It is also the part that most reliably catches the failures that matter. Skipping it is roughly the evaluation equivalent of skipping the last mile of a marathon because you are almost there.

What rigorous evaluation actually looks like

An eval program that predicts production performance requires shifting the question from “what score does this model achieve?” to “does this model behave reliably under the conditions we will actually run it in?” That reframe sounds small. It changes everything about how you build your eval suite.

What a production-grade eval suite covers

A production-grade eval suite covers:

Task-specific evals built from your own data distribution, covering the edge cases and adversarial inputs that generic benchmarks ignore
Latency, cost-per-task, and failure mode tracking alongside accuracy, giving a picture that maps to real decisions
Multi-step task completion evaluated under realistic tool constraints for agentic systems, with human-in-the-loop checkpoints that reflect how the system will actually be operated

The teams making the most of enterprise AI in 2026 are running automated evaluations on every prompt, model, or tool change before deployment, according to AI agent adoption research published by Digital Applied in April 2026.

That discipline is tedious, unglamorous, and completely invisible to anyone who writes analyst reports about AI adoption.

It is also what separates the 14% of enterprises that have successfully scaled agents to production from the 78% still running pilots and wondering why things keep breaking.

Final thoughts

Benchmark scores are a useful starting point for model selection. The problem is the industry has spent years treating them as a finishing point, and the gap between leaderboard performance and production reality is the bill coming due.

💡

The good news: rigorous evaluation is a solvable problem. The tooling is maturing, the frameworks exist, and the teams who have done the work are seeing the results.

The honest ask is committing the time and resources to build eval programs that reflect your actual deployment conditions rather than the idealized ones that happen to match the standard benchmarks.

“The benchmark said it was fine” is an answer that production environments will test, patiently, every single day. The better answer is knowing exactly where your model stands before it ever gets there.

admin
Jun, Fri, 2026

AI News

Jinhua Zhao named head of the Department of Urban Studies and Planning

Jinhua Zhao MCP ’04, SM ’04, PhD ’09 has been appointed head of the Department of Urban Studies and Planning (DUSP), effective July 1. Zhao is the Class of 1941 Professor of Cities and Transportation at MIT.

In making the announcement, dean of the MIT School of Architecture and Planning Hashim Sarkis noted that Zhao is a renowned transportation planner, educator, and scholar, and a world leader in imagining and shaping better futures for mobility.

“Jinhua is one of those rare scholars who moves seamlessly between cutting-edge research and real-world policy,” says Sarkis. “His work with governments and transportation agencies around the world is a model for what MIT’s impact can look like beyond our campus.”

Zhao succeeds Professor Christopher Zegras, who has served as department head since 2020. Under his leadership, DUSP expanded opportunities for students to engage directly with communities and policymakers around the world and continued to strengthen its long-standing connection between research and practice. “I want to extend my gratitude to Chris Zegras for his excellent and level-headed leadership, especially in challenging times,” says Sarkis.

After earning advanced degrees at MIT, Zhao joined the DUSP faculty. He says he found the Institute’s lack of conventionality and its culture of sharing ideas across disciplines stimulating.

“MIT is a small school in the best sense of the word,” says Zhao. “We have fewer boundaries than other universities — intellectually and physically. Our ‘infinite corridor’ literally connects us to so many disciplines.”

Shaping mobility systems worldwide

That connectivity has been key for Zhao’s research and programs he has founded at MIT. Respected as a global authority on mobility, his research has been put into practice across some of the world’s most complex mobility challenges. He and his team have shaped policy for Transport for London, the Mass Transit Railway in Hong Kong, and Japan Railways. His research has positively impacted leading U.S. transit authorities including Boston’s MBTA, the Chicago Transit Authority, and Washington’s Metropolitan Area Transit Authority. He has guided strategic planning for mobility industry on the future of autonomous and digital mobility, and developed autonomous vehicle (AV) deployment strategy in Singapore and the Middle East.

“Every city I’ve worked with faces the same tension: The technology is moving faster than the institutions designed to govern it,” says Zhao. “My work has been about closing that gap.”

At MIT, Zhao founded the MIT Mobility Initiative, which engages mobility and transportation researchers across the Institute as well as leaders in these disciplines from around the world. Zhao hosts the weekly MIT Mobility Forum via Zoom, with each discussion open to the public. What began as a small internal list of participants has grown into a global platform, drawing more than 200 practitioners, policymakers, and researchers every week around the world. The sizeable interest in the subject doesn’t surprise Zhao.

“No single discipline owns transportation,” says Zhao. “AI and autonomous systems are reshaping urban living faster than most institutions can adapt. The question is no longer what we know. It is whether the people who need it most — municipal governments, transport agencies, federal ministries — can access it when they make decisions on transportation. This is why the forum exists.”

Zhao directs the JTL Urban Mobility Lab that unites behavioral science and transportation technology to shape travel behavior, design mobility systems, and improve transportation policies. He is also a lead principal investigator with Mens, Manus, and Machina, an MIT initiative at the intersection of artificial intelligence, the future of work, and human learning, developing the tools and strategies for how cities, institutions, and economies can be designed to ensure AI augments, rather than displaces, the people within them.

DUSP’s global agenda

“If you look at the global agenda, what are the issues people are facing?” asks Zhao. “An aging society; AI and its impact on jobs; the energy crisis; traffic congestion. These are just some of the problems people feel connected to because they are embodied in our cities and communities. I want DUSP to engage with the city leaders and share our research and insights.”

As he prepares to step into his role as department head, Zhao says he would like the research generated within DUSP to more quickly reach those who need it most: the planners, officials, and engineers making decisions in cities right now. A transit authority grappling with AV integration; a city government rethinking aging infrastructure; a leading transport ministry navigating the policy implications of AI — these are the constituencies Zhao believes DUSP should be in active conversation with.

“We know a great deal about how cities grow, how people move, and how that will change. The question is whether the people responsible for making these changes — in city halls, transport agencies, federal ministries — can access what we know, when they need it.”

admin
Jun, Fri, 2026

AI News

Software Update Automatically Turns off Amazon Delivery Drivers’ AC During Dangerous Summer Heat

A software update to some Amazon delivery vehicles is automatically turning off the air conditioning after a few seconds if the driver is not in their seat, according to multiple Amazon delivery drivers who are complaining about the update online.

According to Amazon delivery drivers, the new update is for the Amazon EDV (electric delivery vehicle), the custom-built Rivian van. Delivery drivers say that this update automatically turns off the air conditioning in the van if the driver is not in the vehicle for more than 30 seconds. Drivers are complaining about the update as the start of the summer season, which can be particularly difficult and dangerous for delivery drivers.

“As many of you are aware, the EDVs just got a software update where if you are out of your seat for 30 seconds with the side door open, the AC switches off,” one Amazon delivery driver said in an online forum for drivers. “We all hate this obviously.”

When reached for comment an Amazon spokesperson said that the premise of my questions to the company was inaccurate, but conceded that the van will turn off the AC after 30 seconds under certain conditions that are commonplace during Amazon delivery shifts.

“Rivian recently released a software update for Electric Delivery Vehicles that actually extends climate control for drivers,” the Amazon spokesperson said. “As a result, the AC now runs for up to 10 minutes after a driver exits the vehicle, ensuring a cool cabin when they return. The timer resets at every stop. The AC only shuts off if the driver sliding door is left open for more than 30 seconds — a battery conservation measure.”

Amazon delivery drivers discussing the update online say that they are getting in and out of the van so frequently, and are spending most of their time out of the van delivering packages, that the update makes it harder to keep the van cool.

“Thing is we are up and about waaaay longer than we are driving so the ac turns off and when it turns on again we are already getting up before im the air is even cold,” one driver said. “It effectively made the ac not work and those vans get hot as fuuuck.”

“Every Amazon-branded vehicle is air-conditioned—a feature that exceeds the industry standard—and if the air-conditioning isn’t working in a vehicle, that vehicle is taken out of service immediately,” the Amazon spokesperson said. “They also have cooling seats for drivers. This update was intentionally timed ahead of summer to improve driver comfort during the hottest months of the year. Driver safety and comfort in extreme temperatures remains a priority. If drivers have questions about this change, they should touch base with the DSP they work for – as details about this change were shared with them.”

Older delivery trucks may not have air conditioning or have air conditioning that breaks often. Delivery drivers for UPS, who are represented by the Teamsters union, negotiated a heat safety agreement with the company in 2023. Amazon has publicly outlined its strategy for keeping all its workers, including delivery drivers, safe during the heat, including using an app to ask drivers to take 10-minute break from the heat by resting in a cool place and drinking water, but Amazon delivery drivers are managed by a nationwide network of subcontractors who drivers say don’t always maintain those standards.

As you’ve probably seen in your own neighborhood, delivery drivers will often park their vans wherever they can and deliver packages to multiple addresses on the same block. Amazon automatically turning off the air conditioning while they are out of the van delivering packages means the van can get hot again by the time they get back. As Amazon delivery drivers have to make frequent stops, it’s not hard to imagine why drivers would complain about Amazon automatically shutting down the AC, which makes it more difficult to cool down between stops.

Category AI News

A visit to the cetacean cemetery

Bathroom blast from the past

Eat your galactic green peas

Journey to the magic spyglass in space

How benchmarks became a leaderboard sport

Where the incentives went wrong

The contamination problem runs deeper than most teams realize

Claude responded

What the numbers actually measure

Enter humanity’s last exam

The structural mismatch between benchmarks and production

What production actually throws at your model

The emerging evaluation stack

The layered enterprise approach

What rigorous evaluation actually looks like