How can we improve noisy labor market data?

Data on labor markets is incredibly important to policymakers, from state and federal legislators to the central bank and Treasury Department. But this data is often noisy or incomplete. A new BPEA paper draws on a large body of data sets going back to 1960 to develop a novel approach to studying labor markets that can help fill these gaps. On this episode of the Brookings Podcast on Economic Activity, Brookings Senior Fellow Louise Sheiner talks to Ayşegül Şahin, one of the paper’s authors, about this new methodology.

Transcript

[music]

EBERLY: I am Jan Eberly, the James R. and Helen D. Russell Professor of Finance at Northwestern University.

STEINSSON: And I’m Jón Steinsson, Marek Professor of Public Policy and Economics at the University of California, Berkeley.

EBERLY: We are the co-editors of the Brookings Papers on Economic Activity, a semi-annual academic conference and journal that pairs rigorous research with real-time policy analysis to address the most urgent economic challenges of the day.

STEINSSON: And this is the Brookings Podcast on Economic Activity, where we share conversations with leading economists on the research they do and how it will affect economic policy.

EBERLY: Data on the labor market is essential to policymaking, whether by Congress, the Federal Reserve, the Treasury Department, or any other policymaking body.

But as employees of over a dozen federal statistical agencies will tell you, producing that data is a huge, challenging job. A new BPEA paper, “Making Sense of Labor Market Indicators Amid Data Imperfections,” presents a novel methodology that could make that job a little easier while also making the data more resilient to circumstances like government shutdowns.

On today’s episode of the BPEA podcast, Louise Sheiner, senior fellow at Brookings, speaks with the paper authors about their new methodology.

STEINSSON: I’m really excited about this paper because it’s attempting to provide a clear interpretation of what’s going on at any given point in time when it comes to the really large array of different labor market indicators in the United States.

Each one of these is somewhat idiosyncratic and is measured with error, so pooling across different measures can be very valuable. But then sometimes the different measures don’t seem to be telling the same story, or maybe they are, and one just needs to be sure to allow for some combination of a few different narratives occurring at the same time.

That’s what the authors are trying to figure out.

EBERLY: Now let’s turn it over to Louise.

[2:12]

SHEINER: Thank you, Jan and Jón. I’m really happy to be here, and I’m very excited to welcome Ayşegül Şahin to discuss her recent paper. So welcome, Ayşegül.

SAHIN: Thank you, Louise. Thank you for having me.

SHEINER: Okay, so before we get to your paper and your findings, let’s frame the discussion a bit for our listeners.

So why is it so important to understand the health of the labor market and what’s driving it, and why can’t we just look at the unemployment rate or the payroll jobs number?

[2:38]

SAHIN: Well, Louise, as you know, for most households, the major source of income is their labor income. So as a result, if you wanna think about economic wellbeing of individuals in the economy, we really need to understand the labor market really well.

And if you’re a policymaker, you also need to understand the labor market because that’s the channel through which all macroeconomic policies are affecting households. So that’s one of the main reasons why labor market, for me, is the most important market in the economy. And about your question why don’t we just look at the unemployment rate and payroll growth?

Those are indeed the most widely watched, the most important indicators that we always look at. I think there are two main reasons. One is measurement. Imagine the US economy. It’s the biggest economy in the world, and we are trying to understand what’s happening in the economy in real time. Of course, it’s going to be subject to noise, and it’s going to be subject to revisions.

So if you are a policymaker trying to take decisions in real time, you will have to take decisions based on these noisy measures. And isn’t it better if you look at more of these measures instead of just looking at two? So that’s the first reason. The second is that the meaning of indicators evolves over time.

Think about the 1990s. For many people, it’s the healthiest labor market that we have lived through. But the unemployment rate around ’97, ’98 was actually very, very similar to today. But now we are not having that discussion. Why? Because we have an older, more educated labor force, and we think that the natural rate is lower now.

So as a result, just trying to focus on these two indicators will give you a noisy and oftentimes misleading evaluation of the labor market.

SHEINER: That’s very clear and very interesting. Thank you. So, because of that, your paper develops a new method to analyze labor market data.

So, can you tell us exactly about what you do?

[4:47]

SAHIN: Well, as you said we want to look at a lot of data, and in our paper, we are looking at about almost 100 labor market indicators. So a common way we approach this, we want to give particular meanings to different indicators. Unemployment and vacancies are demand. Labor force participation is supply.

Because labor supply, demand, and frictions are three narratives that we like to use to communicate what’s happening in the labor market. But giving these particular meanings to different indicators is often misleading because labor market outcomes are determined by these joint forces. So instead of asking what the unemployment rate is telling me today, what we do is look at these almost 100 indicators, and we ask what’s the joint evolution of labor supply, labor demand, and frictions that is consistent with these 100 indicators that I’m observing?

So that’s essentially what we do, and that’s the main difference. So, we don’t take a stand on what each indicator is. Instead, we try to provide a joint holistic view of the labor market, and then narrow it down to labor demand, labor supply, and friction so that it’s much easier to communicate for policymakers.

SHEINER: So, can we talk about those three things? Labor demand is presumably what employers want in terms of how many people to hire. Labor supply is how many people wanna work. Tell me a little bit more about frictions.

[6:25]

SAHIN: Of course. Well, when I think about labor demand, the way I try to understand is that employers want to hire, and as a result, employment will increase, but wages will also increase.

But if I have a lot of workers coming in, this is going to put downward pressure on wages, even though employment is going to go up. So, it’s a very simple distinction that’s actually at the back of every textbook model that we have, and policymakers like to think of it. But then going back to how big the US labor market is, we want to find jobs, firms want to hire people.

I live in New Jersey, you live somewhere else. So, there are all these search and matching frictions that prevent matches to happen in the economy. So, these frictions sometimes are worse, sometimes are better. Think about COVID.

COVID made people less likely to want to work in offices. As a result, they wanted to work remotely, right? So that was an adjustment that actually prevented matches from happening. So, labor market frictions is really related to search and matching as well as wage setting in the economy.

We all know that our wages are lower than we want them to be, but then they are not negotiated every week. They are negotiated every year or sometimes twice a year. So, these all affect the balance between labor demand and labor supply.

SHEINER: Okay, so you take all these data, and then you look at the history of the labor market, and then you can tell, oh, here are periods where demand was high, here are periods where supply was high, and we can kind of characterize the evolution of the labor market in terms of these three underlying forces: labor supply, labor demand, and frictions, right?

So then, one of the things that you talk about in your paper is the soft landing that we got post-COVID. So, people were worried that we wouldn’t be able to get a soft landing, that we would have to slip into a recession and that didn’t happen. And so tell us what your characterization of the labor market explains about how that happened.

[8:30]

SAHIN: You’re right that there was a lot of worry about a potential recession, especially a big increase in the unemployment rate. The reason is that the job vacancies, which is often interpreted as the measure of labor demand, and which we argue should not be, peaked at 12 million or so. And this was March 2022 when the Federal Reserve started its tightening cycle.

And if you looked at where they were before the COVID recession, they were hovering around 7 million. And the worry was that even if it normalizes back to that 7 million, there will be a lot of unemployed workers. So how do I then think about the labor market? Labor demand will go down so much, there must be a big increase in the unemployment rate.

But it did not. And our method tells us that that’s because there was also a decline in labor supply. The decline in labor supply showed itself as a decline in payroll employment, but at the same time, unemployment didn’t go up because we didn’t need that many jobs or that many vacancies to keep people employed or to be able to absorb new entrants.

But the other thing I want to emphasize is that most people, as I said, think of vacancies as a measure of labor demand. But if you think about 12 million vacancies, and if you think about how many jobs we create typically, there’s a big gap. Even in the best month, we have a couple of hundred thousand jobs.

The reason is that most vacancies are posted to replace workers. Somebody quits and an opening happens, and then we start hiring people. So turnover increases vacancies. And after COVID, there was a big increase in turnover. Some people called it the Great Resignation. And as a result, firms had to keep posting vacancies.

And part of the decline was not about vacancies that were used to hire the unemployed, it was just handling turnover. And our methodology actually shows us this, that labor demand was high, but the record-high vacancies were not only due to that, it was because matching frictions were more severe when people were trying to move to different jobs, different locations, et cetera.

SHEINER: Okay, let me rephrase this, see if I get it right. So, we saw this huge increase in vacancies, another word for job openings, people posting saying, “I am looking for someone,” right? And people said, “Oh my God, this is the hottest labor market ever because we have such a huge increase in vacancies, and the Fed’s gotta bring down this very hot labor market.

And boy, if they do that, aren’t we gonna go into a recession?” And I, I think you’re saying two things, and correct me if I’m wrong, one, the labor demand wasn’t as hot as it appeared just by looking at that job vacancy number relative to history because there was a lot of churn that created people quitting, and therefore every time they quit, there’s a vacancy.

So, if you just looked at vacancies as pure demand, you would be misled, and that’s one of the things that you show in your paper. And second, we were perhaps lucky that at the same time that we needed to bring down what was still a hot labor market, there was also a reduction in labor supply that presumably came from an aging population and immigration.

And so the next time, if we have such a huge increase in vacancies, you know, can we be assured of a soft landing, or was there some element of luck? Is that something that you could tell from your paper?

[12:07]

SAHIN: It’s a great question. So, if we look at our labor demand measure, the decline is similar to what we have seen at the onset of the Great Recession.

So, it’s really big, but it’s gradual. It took three years for the decline to happen. Why? Because the slowdown mostly happened through the job creation margin. We didn’t see a big wave of job destruction, and that’s what we have seen in the Great Recession. So, the gradual decline is key. And that’s why it’s really important for us to use a wide set of indicators, because the way we measure the gradual decline is through the joint evolution of all these indicators.

SHEINER: Great. So, we’re talking about labor supply, and I mentioned immigration. And so we know that immigration has declined very sharply and also that we have an aging population, which means that the labor force is not growing the way it once did. It’s shrinking. The combination of an aging population and immigration is shrinking it.

And that has made it very hard to know how to interpret the payroll numbers when we hear them to know, even without your paper, we all know, like, well, wait a minute. Is that a big number or a small number? What is the breakeven number?

So, can you tell us a little bit about what this breakeven number is, why it’s important, and what your paper tells us about what it might be?

[13:34]

SAHIN: Great question. Breakeven employment rate is a very appealing concept for policymakers because we want to think about the health of the labor market, the flow of employment that is just enough to absorb new entrants to the labor market. So as a result, unemployment rate will be steady. The labor market is going to be kind of in a neutral, steady situation.

We’re just gonna create enough jobs so that we don’t have an increase or decrease in the unemployment rate. The concept is very appealing, but of course, practice is more difficult because, as you have mentioned, changes in population growth, age structure, immigration fluctuations, as well as female labor force participation all these factors will affect the breakeven employment rate.

And currently, we have seen big upward and downward fluctuations, downward fluctuations in immigration. As a result, the estimated breakeven coming from different sources changed very rapidly. And in our paper, we try to provide different scenarios. This is population growth possibilities, this is labor force participation, and this is the range that we expect.

And using CBO’s recent population growth estimates and our own labor force participation projections, we are around 25,000 to 100,000 a month. But based on incoming data as of May 2026, I would expect that we are actually on the lower side of this range, possibly between 25,000 and 50,000.

And of course, it’s very hard to think about a healthy labor market where the number of jobs is between 25,000 and 50,000, but this is the reality of the population growth slowdown. And if you look at different countries, we see that in countries where population is growing really fast, employment grows fast as well.

But in countries like Japan and Korea where aging is more advanced compared to the U.S. economy, we see stagnant employment numbers. So, this is really a big change for the U.S. economy where population growth rate was much higher before.

SHEINER: So that means when people are listening to the reports about the employment report that just came out, and they’re like, “50,000 jobs were created,” a few years ago we would’ve thought, “Oh, wow, that’s a terrible labor market.”

But right now we’d say, “That’s a pretty good labor market.” Is that how to think about it?

[16:14]

SAHIN: Yes. And then the other thing of course, is this is the level which is required to keep the unemployment rate constant. But after recessions, there’s a big hole. The unemployment starts going down, then we will get higher numbers because we are still closing the gap that opened during the recession, and that explains the higher job numbers that we have seen when unemployment rate was still going down. But we think that there is a natural sort of a minimum unemployment rate for each economy depending on the age structure, skills, industry composition, as well as turnover, and my best estimate of this number for the U.S. economy is mid-threes. We’re not going to see unemployment going down to 2% for the U.S. economy where it’s feasible for Japan, for example.

SHEINER: So, you think we still have some ways to go. So, we think we could get a lower unemployment rate than we have now ’cause we’re now in, like, what, four and a half or something like that?

[17:16]

SAHIN: Yeah, I think 3.5, and that’s… I think the minimum was 3.4. That’s kind of what I would refer to as the minimum unemployment rate that will prevail if it’s only frictional, if it’s handling all these frictions that we talked about.

SHEINER: Okay. Looking back and thinking about your paper, did you intend this to be a way of like, “Let’s look at the history of labor markets and understand them,” or are you really proposing this as a tool that policymakers can be using in real time to do better on specifically monetary policy, which is something that’s sort of constantly responding to data?

[17:57]

SAHIN: We really wanted to provide a new tool, which is disciplined with very rich data. But still distilled to clear concepts for communication. Labor supply, labor demand, and frictions in the economy. But of course, macroeconomists love the past because it provides us a laboratory to be able to evaluate how well our method is doing.

And that’s why we relied on a very long time series, as far as we could with rich dataset, and try to revisit the episodes that we now have a lot of information so that we could build a forward-looking tool. And that’s our hope that this will be a useful tool for policymakers who have the tough task of setting policy in real time with noisy and incomplete data.

SHEINER: So, let’s talk about some of those datasets that you looked at.

So, we’ve obviously talked about the unemployment rate and the payroll jobs numbers, which I presume are in your dataset, and we talked about vacancies. And so, what other kinds of indicators are actually in your datasets, and how many are private data, how many are public data?

[19:11]

SAHIN: So, I will have to check this, but my guess is about two-thirds probably are coming from official statistics, while the one-third, they’re either surveys or privately provided data. So, we rely on a really big set of labor market data. We constrain ourselves to labor market, but hopefully we will provide an extension where we could bring in non-labor market data to be able to analyze the whole economy as well.

I mean, we are really happy that the U.S. statistical infrastructure is so rich, and this wasn’t the case always. I always talk about the Great Depression. During that time, policymakers, households, they were sure that they were living in a terrible labor market, but we lacked the statistical infrastructure to measure it real time.

And there was a patchwork of data availability coming from the manufacturing sector, coming from banks, some newspaper job listings, et cetera. And it’s not a coincidence that the statistical infrastructure that we rely on today was built after the Great Depression. And I think we really don’t want to find ourselves in this situation again.

And our method as well as methods that other macroeconomists are developing might help to fill in the gaps if you are missing one month or a couple of observations. But there is really no substitute for consistent and timely labor market data to be able to understand what’s happening. And I can’t emphasize enough how important that is.

SHEINER: Yeah. My next question was gonna be right about that, so I’m glad you talked about it.

So, we all know that the statistical agencies are under a lot of budgetary pressure. And, you know, it can be hard for people to understand the value of what they do. I don’t know if people know, but the US is really the gold standard for data, both in terms of the breadth of the data and the respect that people have for official statistics, and for the very, very good people working on these things to make sure they’re representative, to make sure they’re high quality, and to make sure that they’re not influenced by politics.

And so, I think right now we’re all a little bit worried about our statistical system because of the budgetary pressures, and because some people are worried that it could become more politically influenced. So, given all the work that you’ve done, how much of a loss would it be if the statistical agencies had to drop some statistics or couldn’t field surveys that were large enough, which would increase the noise in the data?

How do we think about those costs in terms of like monetary policy?

[22:01]

SAHIN: Especially for monetary policy where decisions are taken eight times a year, maybe sometimes even more, it’s really important to have timely statistics, and it’s very important to be able to compare those statistics with what happened in the past.

Because for policymakers, the past is the best indicator that you could test yourselves, because this is not an experimental science. We can’t implement policies and see which one is going to work on the U.S. economy, and that’s why official data sources are very important. Like the unemployment rate goes back all the way.

We have almost 80 years of data on that. I love private data sources, real time everything we could get, ad hoc surveys that we could implement, but we don’t have the same consistency over time, and we lose this comparison frame. But obviously, as the economy evolves, our data collection efforts will also have to evolve.

And this requires more investment, not less, because now AI is very important, and we don’t want AI-generated surveys. So, what do we need to do? We need to invest in surveys because a lot of people would like to talk to someone when they are answering questions, and that’s the highest quality survey, but it’s more expensive.

So instead of actually thinking that we cannot capture it, we have to go back to it and modernize it.

SHEINER: I couldn’t agree with you more, and I think on that note we’ll end. I thank you so much for a really great discussion, a fascinating paper, and this has been really fun. Thank you.

[23:49]

SAHIN: It was great. Thank you.

[music]

STEINSSON: Once again, I’m Jón Steinsson.

EBERLY: And I’m Jan Eberly.

STEINSSON: And this has been season eight of the Brookings Podcast on Economic Activity. Thanks to our guests for this great conversation, as well as a big thanks to all the BPEA authors and Brookings experts who joined the podcast this season. Be sure to subscribe to get notifications when we launch season nine in October 2026.

EBERLY: The Brookings Podcast on Economic Activity is produced by the Brookings Podcast Network. Learn more about this and our other podcasts at Brookings dot edu slash podcasts. Send feedback to podcasts at Brookings dot edu and find out more about the Brookings Papers on Economic Activity online at Brookings dot edu slash B-P-E-A.

STEINSSON: Thanks to the team that makes this podcast possible: Fred Dews, supervising producer; Chris Miller, co-producer; and Gaston Reboredo, co-producer and audio engineer. Show art was designed by Katie Meris. And promotional support comes from our colleagues in Brookings Communications.
