The CEMFI Summer School aims to provide practitioners and academics from all over the world with an opportunity to update their training in fields within CEMFI's range of expertise.
A variety of one-week courses are offered each year, during late August and early September.
The courses are taught by leaders in their fields and are based on innovative teaching practices that combine regular lectures with personalized interaction between the instructor and course participants. Courses typically combine formal lectures, discussion sessions and, in some cases, workshop sessions where participants can present and discuss their own work. In more applied courses, practical classes outside the regular schedule provide additional hands-on experience. A course manager is assigned to each course to coordinate all activities.
In-person courses: Each course has between 10 and 36 participants and consists of five daily sessions, each lasting three and a half hours including a 30-minute break; sessions take place either in the morning or in the afternoon. On the evening of the second day of each course, the School organizes a course dinner that gives participants and instructors an occasion to interact in a relaxed atmosphere.

Instructor: Christopher Rauh (University of Cambridge)
Dates: 17-21 August 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Intended for
Academic researchers, policy analysts, data analysts, and consultants who are using, or who wish to use, unstructured data sources such as text, detailed surveys, images, or speech data in their work.
Prerequisites
A basic familiarity with probability and statistics at the advanced undergraduate level. The hands-on classes will require students to work through Python notebooks prepared in advance. Extension problems will involve modifying these notebooks, which requires familiarity with the basics of Python. A teaching assistant will provide an introductory session on Python, so previous programming experience in other languages is sufficient.
Overview
Over the past decade, unstructured data such as text, images, and audio have become increasingly important in economics and related fields. At the same time, the rapid development of large language models (LLMs) and generative AI has fundamentally changed how researchers work with these data. Tools such as ChatGPT and related models now allow economists not only to classify text or measure sentiment, but also to generate text, build forecasts, and adapt models to specific research contexts through fine-tuning. Together, these advances have opened up new possibilities for extracting information, automating research tasks, and developing AI-augmented economic models.
This course equips participants with the conceptual understanding and practical skills needed to work effectively with unstructured data, large language models, and generative AI in economic research. The emphasis is on combining intuitive theoretical insights with hands-on implementation. Participants will gain experience fine-tuning LLMs, building AI-based analytical pipelines, and applying generative models to modern empirical questions in economics.
The course is organized around five main components:
1. Analytical techniques. Students are introduced to key statistical and machine-learning methods used to analyze unstructured data. The course will cover predictive models such as neural networks and random forests. Rather than focusing on formal derivations, the course emphasizes intuition and practical relevance, with particular attention to applications in natural-language processing and generative AI.
2. Large language models and generative AI. This component covers the architecture, training, and use of modern LLMs in economic research. Topics include text embeddings and transformer models (such as BERT, GPT, and LLaMA), fine-tuning and prompt design for domain-specific applications, and the use of generative models to create synthetic data for simulation and forecasting. The course also addresses key challenges, including interpretability, bias, and ethical concerns.
3. Economic applications. We will explore how these tools can be applied to concrete research problems, including topic modeling and sentiment analysis of policy debates and financial markets, fine-tuned LLMs for economic forecasting and macroeconomic analysis, and generative AI for survey design, automation, and synthetic data generation.
4. Hands-on implementation. Through guided coding sessions, participants will apply the methods covered in class to real-world datasets. Using Python and tools such as Hugging Face Transformers, they will build custom NLP pipelines, fine-tune language models on domain-specific corpora (for example, financial reports or policy documents), and integrate AI-based methods into their own research workflows.
5. Data collection and preparation. The final component focuses on the practical challenges of working with unstructured data. Topics include web scraping and API-based data collection, data cleaning and preprocessing pipelines, structuring data for machine learning models, and ethical considerations when working with AI-generated content.
Topics
- Text analysis, including tf-idf and topic modeling (Latent Dirichlet Allocation, LDA)
- Transformer-based language models, pretraining, and fine-tuning for economic tasks
- Evaluation of AI-based predictions, including accuracy, precision-recall, and interpretability
- Image analysis and classification using convolutional neural networks and transfer learning
- Web scraping and automated extraction of online data
- Speech-to-text methods and sentiment analysis of spoken language
- Generative AI for text generation, synthetic data, and AI-assisted research
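As a flavor of the hands-on notebooks, the sketch below computes tf-idf weights by hand on a toy corpus. The documents and tokenization choices are invented for illustration and are not course material.

```python
# Illustrative sketch (invented corpus): computing tf-idf weights by hand.
import math
from collections import Counter

docs = [
    "inflation expectations rose after the bank statement".split(),
    "the bank raised rates to curb inflation".split(),
    "stock prices fell as earnings disappointed investors".split(),
]

n_docs = len(docs)
# Document frequency: in how many documents does each term appear?
df = Counter(term for doc in docs for term in set(doc))

def tfidf(doc):
    """Term frequency times inverse document frequency for one document."""
    tf = Counter(doc)
    return {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}

weights = tfidf(docs[0])
# "inflation" appears in 2 of 3 documents -> low idf; "expectations" in 1 -> higher
```

Terms that appear in many documents receive little weight, which is why tf-idf highlights distinctive rather than merely frequent vocabulary.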
Practical Classes
There will be some voluntary sessions in the afternoon (from 15:00 to 17:00) led by a teaching assistant. Exact dates will be announced before the beginning of the course.

Instructor: Stephane Bonhomme (University of Chicago)
Dates: 17-21 August 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Intended for
Applied researchers and econometricians interested in estimating economic models using panel data.
Prerequisites
Master’s-level courses in probability and statistics, and econometrics.
Overview
This course provides applied researchers and econometricians with tools to estimate dynamic models using panel data. Particular emphasis is placed on relaxing strict exogeneity assumptions, which are widely used in empirical work, including difference-in-differences designs and event studies, but are often economically and empirically restrictive.
We will review classic methods for estimating linear dynamic panel data models with sequentially exogenous covariates. We will then discuss more recent approaches that extend these methods to settings with nonlinearities, coefficient heterogeneity, and network dynamics.
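As an illustration of why dynamic feedback matters, the simulation below (invented parameters, not course material) reproduces the well-known downward bias of the within (fixed-effects) estimator in a short dynamic panel:

```python
# Illustrative simulation (invented parameters): Nickell bias of the within
# estimator in the dynamic panel y_it = rho*y_{i,t-1} + alpha_i + eps_it, small T.
import numpy as np

rng = np.random.default_rng(0)
N, T, rho = 2000, 5, 0.5

alpha = rng.normal(size=N)
y = np.zeros((N, T + 1))
y[:, 0] = alpha / (1 - rho)              # start each unit near its long-run mean
for t in range(1, T + 1):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=N)

# Within transformation: demean y_it and y_{i,t-1} over each unit's time series
y_lag, y_cur = y[:, :-1], y[:, 1:]
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
rho_within = (yl * yc).sum() / (yl ** 2).sum()
# With T = 5, the estimate falls well below the true rho = 0.5
```

The bias arises because demeaning makes the lagged regressor mechanically correlated with the transformed error, which is one motivation for the GMM approaches covered in the course.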
Topics
- Conceptual foundations: strict exogeneity, sequential exogeneity, and feedback
- Bias in models with dynamic feedback
- Implications for difference-in-differences and event-study designs
- Classic dynamic panel data methods I: GMM
- Classic dynamic panel data methods II: quasi-likelihood and large-T approaches
- Coefficient heterogeneity in models with dynamic feedback
- Nonlinear models with dynamic feedback
- Dynamic models in networks

Instructor: Sophia Kazinnik (Stanford University)
Dates: 17-21 August 2026
Hours: 15:00 to 18:30 CEST
Format: In person
Intended for
Researchers, central bank economists, and PhD students who want to use large language models (LLMs) in data-driven research, particularly those working at the intersection of monetary and financial economics who want to go beyond basic text analysis and use LLMs for improved measurement, for generating new research ideas or data, and for running agent-based simulations.
Prerequisites
A basic familiarity with coding, econometrics, and statistics. The lectures will use Python notebooks prepared in advance. No prior deep learning experience is required. The course is designed for applied researchers who want to build reproducible workflows with AI that work in real-world settings (including working with confidential data) and who are interested in using LLMs for measurement, generation, and agent-based simulation, particularly in the context of monetary and financial economics.
Overview
Monetary and financial economics increasingly hinge on language and communication, whether we are trying to decode central bank signals, measure narratives in markets, or understand how households and firms form expectations. This course focuses on three complementary ways modern AI can be used in research:
- Extracting economic signals from text (and, when relevant, audio/video): turning complex documents and texts into interpretable measures that can enter standard empirical designs (e.g., stance, uncertainty, risk, guidance, or narrative shifts).
- Producing structured research outputs with generative models: using LLMs to create objects economists actually use, such as forecast variables, scenario-consistent summaries, structured extractions, synthetic survey responses, counterfactual message variants, and scalable measurement targets.
- Simulating economic actors and institutions: building agent-based systems (e.g., forecasters, committee members, depositors, etc.) that interact and deliberate under controlled information sets, creating an “in silico” laboratory for counterfactual policy and market experiments.
A core theme throughout is measurement discipline: we treat LLMs as powerful but fallible instruments and emphasize validation, data leakage safeguards, and error correction, especially when models are used to generate labels or behavioral outcomes at scale.
There are three parts to the course:
- Text as Data (Measurement as Data Creation). Practical approaches for extracting economic signals from text: stance, risk, uncertainty, narratives, and regime shifts, paired with transparent evaluation and error analysis.
- Generation as a Research Tool. Using LLMs to generate research-relevant outputs (e.g., forecast objects, structured extractions, synthetic survey responses, and scalable measurement targets) while being able to validate and “explain” these outputs and enforce constraints when needed.
- Agents and Simulations. Building agent-based workflows that combine personas, real-time/vintage data, and interaction protocols to study phenomena such as expectations formation and decision-making, linking micro-level reasoning and communication to macro/market outcomes.
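As a small example of the measurement-discipline theme, the sketch below shows a textbook misclassification correction that can be applied when a classifier's labels (e.g., an LLM tagging statements as "hawkish") have been validated against human coding. All numbers are invented.

```python
# Illustrative sketch (invented numbers): correcting an aggregate share measured
# by a noisy classifier. If the true-positive rate (TPR) and false-positive rate
# (FPR) are estimated on a human-labeled validation set, the observed share
#   p_obs = TPR * p_true + FPR * (1 - p_true)
# can be inverted to recover the true share p_true.

def corrected_share(p_obs, tpr, fpr):
    """Rogan-Gladen-style correction of a misclassified prevalence."""
    return (p_obs - fpr) / (tpr - fpr)

# Hypothetical validation results: the classifier catches 85% of true positives
# and mislabels 10% of true negatives.
tpr, fpr = 0.85, 0.10
p_true = 0.30                                # ground truth in this toy example
p_obs = tpr * p_true + fpr * (1 - p_true)    # what the raw labels would show
p_hat = corrected_share(p_obs, tpr, fpr)     # recovers p_true
```

The same logic motivates treating LLM labels as a fallible instrument: without a human-labeled validation set, aggregate shares built from raw labels can be systematically off.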
Topics
- AI for central banking and financial texts: stance and intent classification; information extraction; narrative and uncertainty measurement; explainability and systematic evaluation
- LLMs as measurement instruments: human anchoring, bias correction, robustness to prompts/models, and safeguards against data leakage
- Generative AI for research workflows: structured extraction and generation (schemas/JSON); forecast generation; synthetic survey responses and scalable labeling; retrieval-augmented generation (RAG); model fine-tuning
- Agents and multi-agent simulations: personas, controlled information sets, deliberation protocols, and counterfactual experimentation for policy and markets, model steering
- Best research practices: reproducibility, versioning, and deployment constraints in applied policy/finance environments

Instructor: Pol Antràs (Harvard University)
Dates: 24-28 August 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Intended for
Academics, researchers, practitioners, and graduate students interested in international economics.
Prerequisites
First-year Master or PhD level in international economics.
Overview
Topics
- Measuring global production sharing in the world economy via World Input-Output Tables
- Firm-level empirical approaches to GVC participation
- GVCs and quantitative trade theory: the view from macro
- Micro-level approaches to modelling GVCs
- GVCs and trade and industrial policy
- The Future of Globalization

Instructor: Jeffrey Wooldridge (Michigan State University)
Dates: 24-28 August 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Intended for
Empirical researchers and applied econometricians with an interest in recent advances in difference-in-differences estimation.
Prerequisites
A master's-level course in probability and statistics, including a basic understanding of asymptotic tools, and a course in econometrics at the level of W.H. Greene (2018), Econometric Analysis, 8th edition; F. Hayashi (2000), Econometrics; or B. Hansen (2023), Econometrics. Participants are expected to have a working knowledge of ordinary least squares, generalized least squares, two-way fixed effects, and basic nonlinear estimation methods. It is helpful to know about treatment effect estimation assuming unconfounded assignment. Derivations will be kept to a minimum except where they make an essential point.
Overview
The purpose of the course is to provide applied economists with an update on developments in intervention analysis using what are generally known as “difference-in-differences” methods. One theme is that flexible regression methods, whether based on pooled OLS or two-way fixed effects, can be very effective in both common and staggered timing settings. Other methods that rely on commonly used treatment effects estimators, such as inverse probability weighting combined with regression adjustment, are easily motivated in a common framework. Imputation approaches to estimation, and their relationship to pooled methods, will also be discussed.
Special situations such as all units treated, exit from treatment, and time-varying controls will also be discussed. Event-study approaches for detecting the presence of pre-trends, and accounting for heterogeneous trends using both regression and treatment effects estimators, will also be covered. As time permits, strategies with non-binary treatments will be discussed.
The main focus is on microeconometric applications where the number of time periods is small. Nevertheless, some coverage of “large-T” panels is also included, including cases with few treated units. Simple strategies for small-N inference will be discussed and compared with synthetic control methods.
The course will end with a treatment of nonlinear difference-in-differences methods, with a focus on binary, fractional, and nonnegative outcomes (including counts). Logit and Poisson regression are especially attractive for such applications. The final lecture will provide an overview of how regression-based methods extend to the case of repeated cross sections.
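For readers wanting a concrete starting point, the toy simulation below (invented data, not course code) verifies the textbook equivalence between the T = 2 difference of group-by-period means and the coefficient on the interaction term in an OLS regression:

```python
# Illustrative sketch (simulated data): the canonical T = 2 difference-in-
# differences estimand, computed two equivalent ways.
import numpy as np

rng = np.random.default_rng(1)
n = 4000
D = rng.integers(0, 2, n)            # treated-group indicator
Post = rng.integers(0, 2, n)         # post-period indicator
att = 2.0                            # true effect on the treated (invented)
# Parallel trends: common period effect (1.5) and group effect (0.5)
y = 0.5 * D + 1.5 * Post + att * D * Post + rng.normal(size=n)

# (1) Difference of group-by-period means
means = {(d, p): y[(D == d) & (Post == p)].mean() for d in (0, 1) for p in (0, 1)}
did = (means[1, 1] - means[1, 0]) - (means[0, 1] - means[0, 0])

# (2) OLS with an interaction: y on 1, D, Post, D*Post; the interaction
# coefficient equals the difference of differences exactly
X = np.column_stack([np.ones(n), D, Post, D * Post])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
```

The equivalence holds because the saturated regression reproduces the four cell means; the flexible-regression theme of the course builds on this logic.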
Topics
- Introduction and Overview. The T = 2 Case. No Anticipation and Parallel Trends. Controlling for Covariates via Regression Adjustment and Propensity Score Methods
- General Common Intervention Timing. Event Study Estimators and Heterogeneous Trends. Flexible Estimation with Covariates
- Staggered Interventions. Identification and Imputation. Pooled OLS and Extended TWFE. Aggregation
- All Units Eventually Treated. Event Study Estimators. Testing and Correcting for Violations of Parallel Trends. Equivalences of Estimators
- Imputation using Unit Fixed Effects. Rolling Methods. Long Differencing. Propensity Score and Doubly Robust Methods
- Strategies with Exit. Time-Varying Covariates. Unbalanced Panels
- Non-Binary Treatments
- Inference with few Treated Units. Time Series Approaches. Synthetic Control
- Nonlinear DiD. Binary, Fractional, and Nonnegative Responses
- DiD with Repeated Cross Sections.

Instructor: Eric Swanson (University of California, Irvine)
Dates: 24-28 August 2026
Hours: 15:00 to 18:30 CEST
Format: In person
Intended for
Academic researchers and policy analysts who are interested in using high-frequency financial market data to study the effects of conventional and unconventional monetary policies on domestic and foreign financial markets and their economies.
Prerequisites
A basic familiarity with macroeconomics, econometrics, probability, and statistics at the advanced undergraduate level.
Overview
Estimating causal effects in macroeconomics is often difficult, because omitted variables and reverse causality are usually plausible concerns. High-frequency changes in financial markets around significant events (e.g., FOMC or OPEC announcements) provide one of the most appealing methods of causal identification. By focusing on narrow windows of time around these events, the problems of omitted variables and reverse causality are essentially eliminated, because any confounders are dominated by the significant event itself within that narrow window. These financial market changes can then be used in high-frequency OLS regressions to measure causal effects in financial markets, or as instrumental variables for lower-frequency (e.g., monthly or quarterly) changes to measure causal effects in macroeconomic VARs or local projections.
The goal of this course is to familiarize students with the main tools that use high-frequency event studies for causal identification in macroeconomic applications. We will place a heavy emphasis on examples from highly cited research papers throughout the course.
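To fix ideas, here is a minimal simulated version of the identification strategy: within a tight announcement window, the policy surprise dominates other news, so a simple OLS regression plausibly recovers the causal effect. All numbers are invented.

```python
# Illustrative sketch (simulated data): a high-frequency event-study regression
# of an asset-price change on a policy surprise within a narrow window.
import numpy as np

rng = np.random.default_rng(2)
n_meetings = 200
surprise = rng.normal(0, 0.05, n_meetings)   # unexpected policy-rate change (pp)
effect = 0.6                                 # true pass-through to the yield
# In a tight window, background noise is small relative to the announcement
dy = effect * surprise + rng.normal(0, 0.01, n_meetings)

X = np.column_stack([np.ones(n_meetings), surprise])
alpha, beta = np.linalg.lstsq(X, dy, rcond=None)[0]
# beta estimates the causal effect of a 1pp surprise on the yield (about 0.6)
```

Outside the narrow window, the noise term would swell with other macro news and the regression would no longer isolate the announcement's effect, which is exactly the point of the high-frequency approach.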
Topics
- High-frequency measures of monetary policy shocks
- High-frequency measures of macroeconomic data release surprises
- Estimating the effects of conventional monetary policy and macroeconomic data releases on financial markets
- High-frequency measures of forward guidance
- High-frequency measures of large-scale asset purchases
- Estimating the effects of unconventional monetary policies (forward guidance and LSAPs) on financial markets
- The “Fed Information Effect”
- High-frequency evidence from prediction markets
- Estimating the effects of conventional and unconventional monetary policies on macroeconomic variables using high-frequency identification in VARs and local projections
- High-frequency instrument relevance and exogeneity conditions
- High-frequency measures of oil price, fiscal policy, and other shocks
- Monetary policy spillovers across countries
- Using VARs and LPs to evaluate policy counterfactuals

Instructor: Gabriel Ahlfeldt (Humboldt University of Berlin)
Dates: 31 August - 4 September 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Intended for
Academic researchers and practitioners interested in urban and spatial economics, including researchers working on policy evaluation, transportation, land use, housing, and regional development.
Prerequisites
Master's-level courses in statistics, econometrics, and microeconomics. Basic familiarity with object-oriented programming concepts and MATLAB would be useful.
Overview
This course provides an introduction to quantitative spatial economics with a focus on theory, quantification, and counterfactual analysis. The goal is to equip participants with a clear understanding of how modern quantitative spatial models are constructed, how their primitives are quantified, and how such models can be used to evaluate counterfactual urban policies.
The course builds around the canonical quantitative urban model with commuting (Ahlfeldt, Redding, Sturm, Wolf, 2015). Lectures emphasize economic intuition and modeling assumptions, while closely following computational toolkits that allow participants to replicate key results and counterfactuals. Rather than requiring independent programming projects, the course adopts a scalable hands-on approach: participants can engage at different depths, ranging from following the theory and results, to actively working through the code and modifying counterfactual scenarios.
By the end of the course, participants will understand how quantitative spatial models translate economic primitives into equilibrium outcomes and how these models can be solved and used for policy-relevant counterfactual analysis within an existing coding infrastructure.
By the end of the course, participants should be able to:
- Understand the core assumptions underlying quantitative spatial models
- Identify the key primitives that need to be quantified in such models
- Understand how equilibrium outcomes are computed in spatial general equilibrium settings
- Interpret and evaluate counterfactual policy experiments in quantitative spatial models
- Work with an existing quantitative spatial model toolkit and understand its code-based implementation
The course consists of lectures that closely follow computational toolkits used to implement the models discussed. Participants are guided through the logic of the models alongside code that reproduces key results and counterfactuals. Selected parts of the code are discussed to illustrate how theoretical objects are translated into computational routines, allowing participants to engage with the material at their preferred level of technical depth.
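As a taste of how such models are solved computationally, the toy below (in Python rather than the MATLAB toolkits used in class, with invented fundamentals) finds a spatial equilibrium by fixed-point iteration and checks it against the closed form:

```python
# Illustrative toy (not the ARSW toolkit): workers are freely mobile across J
# locations with utility V_j = A_j * L_j^(-gamma), where congestion lowers
# utility in crowded places; equilibrium equalizes V_j across locations.
# Closed form for comparison: L_j proportional to A_j^(1/gamma).
import numpy as np

A = np.array([1.0, 1.5, 2.0])    # invented productivity/amenity fundamentals
gamma = 0.5                      # congestion strength (invented)

L = np.full(3, 1 / 3)            # start from equal population shares
for _ in range(500):
    V = A * L ** (-gamma)
    # Reallocate population toward high-utility locations, with damping
    L_new = L * (V / V.mean()) ** 0.5
    L_new /= L_new.sum()
    if np.max(np.abs(L_new - L)) < 1e-12:
        L = L_new
        break
    L = L_new

closed_form = A ** (1 / gamma) / (A ** (1 / gamma)).sum()
```

Quantitative spatial models replace this scalar utility with commuting, housing, and agglomeration terms, but the solution logic is the same fixed-point structure.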
Topics
- Spatial equilibrium and the quantitative Rosen-Roback model
- Preference heterogeneity and discrete location choice
- The quantitative urban model with commuting (ARSW model)
- Quantification of model primitives using data
- Counterfactual analysis and policy evaluation in quantitative spatial models

Instructor: Òscar Jordà (Federal Reserve Bank of San Francisco and UC Davis)
Dates: 31 August - 4 September 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Intended for
Academic researchers and policy analysts who are interested in modern multivariate time series methods to compute the dynamic effect of policy interventions and the method of local projections in particular.
Prerequisites
Some basic knowledge of probability or statistics is expected. Individuals with undergraduate degrees in economics, statistics or related disciplines should be able to follow the course. The emphasis will be on applications and practical aspects rather than on deep theory. The applications will primarily use the statistics software package STATA.
Overview
Applied economists are often interested in how an intervention will affect an economic outcome. When the data come as a vector time series, or as a panel of such vectors observed for individual units over time, it is important to characterize the dynamic features of the problem as generally as possible. The main objective of the course is thus to introduce the method of local projections (LPs) to examine how interventions affect outcomes over time in the context of general dynamic systems. The flexibility of LPs allows for convenient extensions to explore nonlinearities, state dependence, and policy evaluation more generally, in an easy and accessible way.
Over the past few years there have been numerous extensions to LPs, which will be discussed. These include the estimation of multipliers and the interpretation of impulse responses; new results on impulse response inference; a decomposition of the impulse response into the direct and indirect effects of an intervention, and small-sample composition effects; simple linear-in-parameters methods to estimate time-varying impulse responses; and stratification of impulse responses as a function of economic conditions, among other nonlinear extensions.
More recently, it has become more common to analyze panel data in macroeconomics. Panel data structures allow for richer options, especially on identification. The course will take advantage of these new developments, particularly in the area of difference-in-differences (DiD) identification. LP-DiD methods accommodate a wide range of recently proposed estimators of staggered, heterogeneous, treatment effects.
The breadth of topics covered limits the rigor with which each result will be discussed, though appropriate references will be provided for those interested. The goal of the course is to guide practitioners to appropriate methods for their problems, and to elicit fruitful extensions and avenues for new research. Applications of the methods discussed in class will use the econometrics software package STATA.
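To illustrate the basic idea, the sketch below (in Python rather than the STATA used in class, with simulated data) estimates an impulse response by local projections, running one OLS regression per horizon:

```python
# Illustrative sketch (simulated data): local projections. For an AR(1) process
# y_t = rho*y_{t-1} + e_t with an observed shock e_t, the LP at horizon h
# regresses y_{t+h} on e_t; the slope estimates the impulse response rho^h.
import numpy as np

rng = np.random.default_rng(3)
T, rho, H = 5000, 0.8, 5
e = rng.normal(size=T)
y = np.zeros(T)
y[0] = e[0]
for t in range(1, T):
    y[t] = rho * y[t - 1] + e[t]

irf = []
for h in range(H + 1):
    yh, sh = y[h:], e[: T - h]           # align y_{t+h} with the shock e_t
    X = np.column_stack([np.ones(len(sh)), sh])
    irf.append(np.linalg.lstsq(X, yh, rcond=None)[0][1])
# irf[h] is close to rho**h: 1, 0.8, 0.64, ...
```

Unlike a VAR, no dynamic model is inverted: each horizon is a separate direct regression, which is what makes LPs easy to extend to nonlinear and state-dependent settings.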
Topics
- Introduction to the main questions of interest: a local projection as the dynamic version of traditional policy evaluation. Connection to vector autoregressions and their impulse responses. Multipliers and interpretation of impulse responses under different specifications.
- Inference with local projections.
- Identification.
- Smoothing methods and economic interpretation.
- Matching methods for estimation of Euler equations. Optimal policy perturbations.
- Nonlinearities, Stratification, decomposition, and time-varying impulse responses.
- Panel data structures and inference.
- Staggered, heterogeneous treatment effects in difference-in-differences studies using LPs. Panel data applications.

Instructor: Marco del Negro (Federal Reserve Bank of New York)
Dates: 31 August - 4 September 2026
Hours: 9:30 to 13:00 CEST
Format: In person
Practical Classes
There will be some voluntary sessions in the afternoon (from 15:00 to 17:00) led by a teaching assistant. Exact dates will be announced before the beginning of the course.
Intended for
Practitioners, researchers, and academics interested in time-series methods, business cycle analysis, and forecasting.
Prerequisites
A good background in statistics and econometrics will be useful to follow the class, but no familiarity with the Bayesian approach is required, as the course will start with a brief introduction to Bayesian econometrics.
Overview
The course will offer an overview of modern tools in macroeconometrics, including VARs, state-space models (such as time-varying coefficient models, factor models, and models with stochastic volatility), dynamic stochastic general equilibrium (DSGE) and DSGE-VAR models, model pools, and model averaging. The course will strive to offer enough theory to understand the tools' theoretical underpinnings: why they work, how and when they should be used, and what their limitations are. At the same time, it will emphasize their practical use in macro applications. The course will take a Bayesian perspective, both because this approach has proven useful in applied macroeconomics and because of its computational advantages relative to the frequentist approach. Monte Carlo methods, which lie behind the recent surge in popularity of the Bayesian approach, will be reviewed.
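As a minimal taste of the Bayesian machinery, the sketch below (invented numbers, not course material) performs conjugate updating for a single AR(1) coefficient with known error variance, the scalar analogue of a Bayesian VAR with a normal prior:

```python
# Illustrative sketch (invented numbers): conjugate Bayesian updating for an
# AR(1) coefficient rho when the error variance sigma^2 is known.
import numpy as np

rng = np.random.default_rng(4)
T, rho_true, sigma = 200, 0.7, 1.0
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho_true * y[t - 1] + sigma * rng.normal()

x, z = y[:-1], y[1:]                    # regressor and left-hand side
# Prior: rho ~ N(mu0, tau0^2); likelihood: z | x, rho ~ N(rho * x, sigma^2 I)
mu0, tau0 = 0.0, 1.0
prec_post = 1 / tau0 ** 2 + (x @ x) / sigma ** 2      # posterior precision
mu_post = (mu0 / tau0 ** 2 + (x @ z) / sigma ** 2) / prec_post
sd_post = prec_post ** -0.5
# mu_post shrinks the OLS estimate (x @ z) / (x @ x) toward the prior mean mu0
```

The same precision-weighted averaging of prior and likelihood information underlies Minnesota-type priors for VARs; the multivariate case simply replaces scalars with matrices.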
Topics
- Introduction to Bayesian inference
- VARs
- State-space models and filtering
- An overview of Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods
- Time-varying parameter and stochastic volatility models
- DSGEs and DSGE-VARs
- Forecasting with DSGE models
- Policy analysis with misspecified DSGE models
- Model averaging/combination

Instructor: Dario Caldara (Federal Reserve Board)
Dates: 31 August - 4 September 2026
Hours: 15:00 to 18:30 CEST
Format: In person
Intended for
Academic researchers and researchers in policy institutions who are interested in geopolitics and international economics.
Prerequisites
Master level courses in macroeconomics and econometrics.
Overview
This course examines the intersection of economics and geopolitics, focusing on the measurement and economic impact of geopolitical risks. We will explore various approaches to quantifying geopolitical risks and assess how different countries, industries, and sectors are exposed to them. The course will analyze the economic and financial consequences of geopolitical shocks, including their effects on economic activity, financial markets, and cross-border linkages.
A key emphasis will be on the methodological tools used in geoeconomic analysis. Students will develop analytical skills in textual analysis and working with (un)structured large datasets, applying these methods to real-world economic and policy challenges. Additionally, we will examine selected aspects of international policy coordination in response to geopolitical risks, covering monetary, fiscal, and trade policy considerations.
This course will equip participants with the tools needed to analyze the economic impact of geopolitical developments in an increasingly uncertain global landscape.
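To give a flavor of keyword-based measurement, the toy below computes the share of articles per period that mention terms from a deliberately tiny, invented dictionary; the text-based indices studied in the course are far richer, but the counting logic is similar.

```python
# Toy sketch (invented articles and term list): a newspaper-based risk index in
# the spirit of keyword-count measures, computed as the share of each period's
# articles mentioning at least one dictionary term.
from collections import defaultdict

TERMS = {"war", "invasion", "sanctions", "terrorism"}

articles = [  # (period, text) pairs, invented for demonstration
    ("2026-01", "markets calm as earnings beat expectations"),
    ("2026-01", "sanctions threat weighs on energy prices"),
    ("2026-02", "talk of war and new sanctions rattles investors"),
    ("2026-02", "central bank holds rates steady"),
    ("2026-02", "invasion fears drive oil higher"),
]

hits, totals = defaultdict(int), defaultdict(int)
for period, text in articles:
    totals[period] += 1
    if TERMS & set(text.lower().split()):
        hits[period] += 1

index = {p: hits[p] / totals[p] for p in totals}
# e.g., one of two January articles matches, two of three February articles
```

Real indices must also handle negation, context windows, and changing newspaper coverage over time, which is where the course's methodological material comes in.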
Topics
- Measurement of geopolitical risks
- Overview of textual analysis in economic applications
- Quantification of the economic effects of geopolitical risks
- Selected topics in international policy coordination









