Ray's Curation

Friday, October 3, 2025

We’re introducing GDPval, a new evaluation that measures model performance on economically valuable, real-world tasks across 44 occupations. - OpenAI

We found that today’s best frontier models are already approaching the quality of work produced by industry experts. To test this, we ran blind evaluations where industry experts compared deliverables from several leading models—GPT‑4o, o4-mini, OpenAI o3, GPT‑5, Claude Opus 4.1, Gemini 2.5 Pro, and Grok 4—against human-produced work. Across 220 tasks in the GDPval gold set, we recorded when model outputs were rated as better than (“wins”) or on par with (“ties”) the deliverables from industry experts, as shown in the bar chart below.... We also see clear progress over time on these tasks. Performance has more than doubled from GPT‑4o (released spring 2024) to GPT‑5 (released summer 2025), following a clear linear trend. In addition, we found that frontier models can complete GDPval tasks roughly 100x faster and 100x cheaper than industry experts.

https://openai.com/index/gdpval/

Thursday, October 2, 2025

We urgently call for international red lines to prevent unacceptable AI risks. - AI Red Lines

Some advanced AI systems have already exhibited deceptive and harmful behavior, and yet these systems are being given more autonomy to take actions and make decisions in the world. Left unchecked, many experts, including those at the forefront of development, warn that it will become increasingly difficult to exert meaningful human control in the coming years. Governments must act decisively before the window for meaningful intervention closes. An international agreement on clear and verifiable red lines is necessary for preventing universally unacceptable risks. These red lines should build upon and enforce existing global frameworks and voluntary corporate commitments, ensuring that all advanced AI providers are accountable to shared thresholds. We urge governments to reach an international agreement on red lines for AI — ensuring they are operational, with robust enforcement mechanisms — by the end of 2026.

https://red-lines.ai/?utm_source=newsletter.theaireport.ai&utm_medium=newsletter&utm_campaign=200-world-leaders-demand-ai-red-lines&_bhlid=11a58d14d5cf5f15774f2f52256329b3db7ab7ea

Wednesday, October 1, 2025

AI Hallucinations May Soon Be History - Ray Schroeder, Inside Higher Ed

On Sept. 14, OpenAI researchers published a not-yet-peer-reviewed paper, “Why Language Models Hallucinate,” on arXiv. Gemini 2.5 Flash summarized the findings of the paper: "Systemic Problem: Hallucinations are not simply bugs but a systemic consequence of how AI models are trained and evaluated. Evaluation Incentives: Standard evaluation methods, particularly binary grading systems, reward models for generating an answer, even if it’s incorrect, and punish them for admitting uncertainty. Pressure to Guess: This creates a statistical pressure for large language models (LLMs) to guess rather than say “I don’t know,” as guessing can improve test scores even with the risk of being wrong."

https://www.insidehighered.com/opinion/columns/online-trending-now/2025/10/01/ai-hallucinations-may-soon-be-history

Tuesday, September 30, 2025

Charting the GenAI Blue Ocean: A paradigm shift in business education - Bert Verhoeven, Dr Vishal Rana, Dr Timothy Hor - University of Oxford

The rise of Generative AI (GenAI) signals not just technological progress but a seismic shift in how industries innovate, compete, and create value. Beyond chatbots and workflow automation, GenAI’s potential lies in its ability to personalise experiences, analyse data in real time, and redefine market opportunities. In an era where traditional competition—marked by diminishing margins in "red oceans"—feels increasingly obsolete, the fusion of GenAI with Kim and Mauborgne’s (2005) concept of the Blue Ocean Strategy unlocks new frontiers of innovation, enabling Higher Education to transcend zero-sum competition and imagine entirely new paradigms, reconfiguring the relationship between institutions, teachers, learners, and markets. Blue Ocean Strategy focuses on creating new, uncontested market spaces by redefining industry boundaries and delivering unique value to customers. It shifts the focus from competing in existing markets to innovating and unlocking new demand.

https://aieou.web.ox.ac.uk/article/charting-genai-blue-ocean

Monday, September 29, 2025

US faces shortfall of 5.3M college-educated workers by 2032 - Laura Spitalniak, Higher Ed Dive

Nursing, teaching and engineering would experience the largest gaps, per a study from Georgetown University’s Center on Education and the Workforce. The U.S. will need over 5 million additional workers who have at least some postsecondary education by 2032, according to a report released Tuesday by Georgetown University’s Center on Education and the Workforce. Of that total, 4.5 million will need at least a bachelor’s degree, according to the report. Degree-requiring positions facing “critical skills shortages” include nurses, teachers and engineers, it said.

https://www.highereddive.com/news/us-faces-shortfall-of-53m-college-educated-workers-by-2032/760155/

Sunday, September 28, 2025

Learning analytics-informed teaching strategies: enhancing interactive learning in STEM education - Ying Zheng & Dexian Li, Taylor and Francis Online

Using a mixed-methods approach, data were collected from 1,483 students and 95 teachers through random and purposive sampling. The findings indicate adaptive learning technologies significantly improve student performance by tailoring instruction to individual needs. Real-time educational data analysis enables early identification of disengagement, facilitating timely interventions. Additionally, insights into student interaction patterns inform the development of evidence-based teaching strategies that foster critical thinking and problem-solving skills. The study highlights the transformative role of educational data mining in creating immersive learning environments that enhance conceptual understanding and practical application, reducing achievement gaps among diverse student populations.

https://www.tandfonline.com/doi/full/10.1080/10494820.2025.2553113

Saturday, September 27, 2025

The infrastructure moment - Alastair Green, Ishaan Nangia, and Nicola Sandri - McKinsey

A confluence of global forces is accelerating the need for infrastructure investment. Outdated assets, rapid urbanization, geopolitical shifts, and technological advancements are exposing the limitations of yesterday’s infrastructure. These forces are also changing the very definition of infrastructure. Traditionally, the term has been synonymous with assets such as power grids, roads, ports, and bridges. More recently, advances in technology have meant that newer assets such as fiber-optic networks, hyperscale data centers, and electric-vehicle charging stations are increasingly vital. These modern types of infrastructure share traits with “traditional” infrastructure, including long lifespans, significant initial investment, predictable and resilient cash flows, and critical economic roles.

https://www.mckinsey.com/industries/infrastructure/our-insights/the-infrastructure-moment

Friday, September 26, 2025

Linking digital competence, self-efficacy, and digital stress to perceived interactivity in AI-supported learning contexts - Jiaxin Ren, Nature

As artificial intelligence technologies become more integrated into educational contexts, understanding how learners perceive and interact with such systems remains an important area of inquiry. This study investigated associations between digital competence and learners’ perceived interactivity with artificial intelligence, considering the potential mediating roles of information retrieval self-efficacy and self-efficacy for human–robot interaction, as well as the potential moderating role of digital stress. Drawing on constructivist learning theory, the technology acceptance model, cognitive load theory, the identical elements theory, and the control–value theory of achievement emotions, a moderated serial mediation model was tested using data from 921 Chinese university students. The results indicated that digital competence was positively associated with perceived interactivity, both directly and indirectly through a sequential pathway involving the two forms of self-efficacy.

https://www.nature.com/articles/s41598-025-18873-3

Google Notebook LM’s Capabilities and Impact: Expert analysis from - Agentic Brain, AI Report

The rapid expansion of artificial-intelligence tools has produced dozens of note-taking and research assistants, but few have delivered a coherent, end-to-end learning experience. Google’s Notebook LM stands out because it blends multimodal analysis, grounded responses and interactive learning aids into a single platform. Released in 2023 and continuously updated, Notebook LM has quickly become one of the most impressive AI-enhanced research agents available today. Unlike traditional chatbots that draw on general internet knowledge, Notebook LM grounds every response in the documents you provide. Uploads can include PDFs, Google Docs, Google Slides, websites, YouTube videos, audio files or plain text. Once added, the system becomes an “instant expert” on your materials. You can converse with it in a familiar chat interface or any of the following incredibly diverse capabilities

https://www.theaireport.ai/partner-columns/google-notebook-lms-capabilities-and-impact

Thursday, September 25, 2025

The Declining ROI of MBA Degrees and the Rise of Alternative Skill-Building Platforms - Eli Grant, AInvest

- Micro-credentials ($500–$5K) offer faster ROI (12–18 months) with 15%–30% salary boosts, prioritizing job-ready skills over generalized degrees.

- Employers increasingly value micro-credentials equally to MBAs (68% LinkedIn survey), reflecting a shift toward skills over degree prestige.

- Hybrid models (e.g., MIT MicroMasters) and AI-driven learning platforms are reshaping education by blending affordability with personalized, on-demand upskilling.

https://www.ainvest.com/news/declining-roi-mba-degrees-rise-alternative-skill-building-platforms-2509/

Wednesday, September 24, 2025

Google narrows the gap with ChatGPT as millions tap Nano Banana to make hyperrealistic 3D figurines. - Robert Hart, The Verge

The surge has likely propelled Gemini to the top of various app stores around the world. At the time of writing, Gemini is the leading iPhone app on Apple’s App Stores in the US, UK, Canada, France, Australia, Germany, and Italy. In many cases, it reached the prime position by surpassing OpenAI’s ChatGPT, which now sits in second place. On September 11th, Woodward said “India has found” the image editor and later said that Google was going to have to implement “temporary limits” on usage in order to manage extreme demand. “It’s a full-on stampede to use” Gemini, he said, adding that the “team is doing heroics to keep the system up and running.” So, what’s driving the surge? While a variety of edits have been popular, the runaway hit of Nano Banana has people turning themselves — or their pets — into 3D figurines.

https://www.theverge.com/news/778106/google-gemini-nano-banana-image-editor

Tuesday, September 23, 2025

First-of-its-kind AI tool to save 75% of academics’ time - Sara AlKuwari, Khaleej Times

Hamdan Bin Mohammed Smart University (HBMSU) in the United Arab Emirates has announced the launch of the region’s first AI-powered academic agent, a pioneering tool designed to save up to 75% of faculty members’ time while enhancing students’ academic achievement by 40%, marking a significant step in reshaping the future of higher education, writes Sara AlKuwari for Khaleej Times. The initiative, titled Artificial Intelligence Agent for Every Faculty, is the first of its kind in the UAE and the wider region. It integrates advanced AI capabilities into higher education in line with the UAE National Artificial Intelligence Strategy 2031 and Education Strategy 2033.

https://www.universityworldnews.com/post.php?story=20250913105932566

Monday, September 22, 2025

White House AI Task Force Positions AI as Top Education Priority - Julia Gilban-Cohen, GovTech

When Trump administration officials met with ed-tech leaders at the White House last week to discuss the nation’s vision for artificial intelligence in American life, they repeatedly underscored one central message: Education must be at the heart of the nation’s AI strategy. Established by President Trump’s April 2025 executive order, the White House Task Force on AI Education is chaired by director of science and technology policy Michael Kratsios, and is tasked with promoting AI literacy and proficiency among America’s youth and educators, organizing a nationwide AI challenge and forging public-private partnerships to provide AI education resources to K-12 students.

https://www.govtech.com/education/k-12/white-house-ai-task-force-positions-ai-as-top-education-priority

Sunday, September 21, 2025

The common future of humans and artificial intelligence will be “hybrid professions”! - Uskudar University (Turkey)

On the place that “hybrid professions,” where humans and AI work together, will hold in the future, Dr. İldiz explained: “The definition of a hybrid profession is shaped by how much you can adapt to AI, how you integrate it into your life, and the boundaries you set with your professional expertise. This can provide a future where we do not lose our human aspects but continue to grow, both for ourselves and for our world.”

https://uskudar.edu.tr/en/new/the-common-future-of-humans-and-artificial-intelligence-will-be-hybrid-professions/62826

Saturday, September 20, 2025

Got AI skills? You can earn 43% more in your next job - and not just for tech work - Webb Wright, ZDNET

Demand for AI skills is on the rise across industries. A single AI skill makes a huge difference in listed salaries. Different industries are looking for different AI skills. As businesses race to adopt AI, they're placing a higher premium on job candidates who know their way around the technology. A recent study from labor market research firm Lightcast found that jobs requiring AI-related skills offer higher annual salaries than those that don't. This is true not only in tech-heavy industries like IT and computer science but also across a range of other sectors.

https://www.zdnet.com/article/got-ai-skills-you-can-earn-43-more-in-your-next-job-and-not-just-for-tech-work/

Friday, September 19, 2025

Did OpenAI just solve hallucinations? - Matthew Berman, YouTube

The video explains that hallucinations are ingrained in the models' construction, functioning more as features than bugs. This is compared to human behavior, where guessing on a test might be rewarded, leading models to guess rather than admit uncertainty. The core issue is the absence of a system that rewards models for expressing uncertainty or providing partially correct answers. The proposed solution involves creating models that only answer questions when they meet a certain confidence threshold and implementing a new evaluation system. This system would reward correct answers, penalize incorrect ones, and assign a neutral score for "I don't know" responses. The video concludes by suggesting that the solution lies in revising how models are evaluated and how reinforcement learning is applied. (summary provided in part by Gemini 2.5 Plus)

https://youtube.com/watch?si=bKmPqaT8ihGTtWDn&v=xGO5Q94XXf0&feature=youtu.be
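
The scoring idea summarized above (reward correct answers, penalize incorrect ones, score "I don't know" as neutral) can be illustrated with a little arithmetic. What follows is a minimal sketch, not code from the video or from OpenAI: the function names, the +1 reward for a correct answer, and the `wrong_penalty` parameter are all assumptions chosen for illustration. With `wrong_penalty` set to 0 the sketch behaves like binary grading, where any guess has a non-negative expected score; with a positive penalty, answering only pays off above a confidence threshold.

```python
def expected_score(confidence: float, wrong_penalty: float) -> float:
    """Expected score for answering: +1 if right, -wrong_penalty if wrong.

    `confidence` is the model's (assumed well-calibrated) probability of
    being correct. Both parameters are hypothetical, for illustration only.
    """
    return confidence * 1.0 - (1.0 - confidence) * wrong_penalty


def should_answer(confidence: float, wrong_penalty: float) -> bool:
    """Answer only if the expected score beats the 0 earned by abstaining."""
    return expected_score(confidence, wrong_penalty) > 0.0


def break_even_confidence(wrong_penalty: float) -> float:
    """Confidence at which answering and saying "I don't know" tie."""
    return wrong_penalty / (1.0 + wrong_penalty)


if __name__ == "__main__":
    # Compare binary grading (no penalty) with two penalized schemes.
    for penalty in (0.0, 1.0, 3.0):
        threshold = break_even_confidence(penalty)
        decision = should_answer(0.3, penalty)
        print(f"penalty={penalty}: threshold={threshold}, "
              f"answer at 30% confidence? {decision}")
```

Under these assumptions, a model that is only 30% confident should still guess when wrong answers cost nothing, but should abstain once a wrong answer costs as much as a correct one earns.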

Thursday, September 18, 2025

How AI Impacts Academic Thinking, Writing and Learning - Does AI make for better grades or better thinkers? - Michael Hogan, et al; Psychology Today

Over-reliance on AI risks eroding students’ knowledge and skill development through reduced cognitive effort. In writing tasks, findings suggest that students primarily prompt ChatGPT for data, facts, and information. Educators need activity designs that encourage questioning and verification rather than blind AI acceptance.

https://www.psychologytoday.com/us/blog/in-one-lifespan/202509/how-ai-impacts-academic-thinking-writing-and-learning

Wednesday, September 17, 2025

OPINION: AI can be a great equalizer, but it remains out of reach for millions of Americans; we cannot let that continue - Erin Mote, Hechinger Report

This digital divide is a persistent crisis that deepens societal inequities, and we must rally around one of the most effective tools we have to combat it: the Universal Service Fund. The USF is a long-standing national commitment built on a foundation of bipartisan support and born from the principle that every American, regardless of their location or income, deserves access to communications services. Without this essential program, over 54 million students, 16,000 healthcare providers and 7.5 million high-need subscribers would lose internet service that connects classrooms, rural communities (including their hospitals) and libraries to the internet.

https://hechingerreport.org/opinion-ai-can-be-a-great-equalizer-but-it-remains-out-of-reach-for-millions-of-americans-we-cannot-let-that-continue/

Tuesday, September 16, 2025

OPINION: Schools cannot teach AI literacy without a way to measure it - Amit Sevak, Hechinger Report

Everywhere you look, someone is telling students and workers to “learn AI.” It’s become the go-to advice for staying employable, relevant and prepared for the future. But here’s the problem: While definitions of artificial intelligence literacy are starting to emerge, we still lack a consistent, measurable framework to know whether someone is truly ready to use AI effectively and responsibly. And that is becoming a serious issue for education and workforce systems already being reshaped by AI. Schools and colleges are redesigning their entire curriculums. Companies are rewriting job descriptions. States are launching AI-focused initiatives.

https://hechingerreport.org/opinion-schools-cannot-teach-ai-literacy-without-a-way-to-measure-it/

Monday, September 15, 2025

Duke University pilot project examining pros and cons of using artificial intelligence in college - AP

As part of a new pilot with OpenAI, all Duke undergraduate students, as well as staff, faculty and students across the University’s professional schools, gained free, unlimited access to ChatGPT-4o beginning June 2. The University also announced DukeGPT, a University-managed AI interface that connects users to resources for learning and research and ensures “maximum privacy and robust data protection.” Duke launched a new Provost’s Initiative to examine the opportunities and challenges AI brings to student life on May 23. The initiative will foster campus discourse on the use of AI tools and present recommendations in a report by the end of the fall 2025 semester.

https://www.wral.com/story/duke-university-pilot-project-examining-pros-and-cons-of-using-artificial-intelligence-in-college/22146542/

Sunday, September 14, 2025

Worst to first: What it takes to build or remake a world-class team - Kevin Carmody, Mark Hojnacki, and Rick Gold with Shayne Skov; McKinsey

Building a team is hard; building a winning team is even harder. For every organization that manages to achieve the right mix of talent, culture, and performance expectations, many more find themselves lacking in one area or another. Consider the following cautionary tales. One team of “superstars” in a large technology organization failed to gel simply because they could not agree on working norms. Another high-performing group underachieved because the executive team and line managers had very different views of their roles: Executives were frustrated by line managers’ hesitancy to make and own critical decisions, while the line managers were afraid to be labeled as failures by these same executives if their moves deviated too far from the status quo. Both sides pointed fingers at each other when outcomes failed to meet expectations.

https://www.mckinsey.com/capabilities/transformation/our-insights/worst-to-first-what-it-takes-to-build-or-remake-a-world-class-team