How to Write an Abstract for a Scientific Paper
If you're preparing a research paper or grant proposal, you'll need to know how to write an abstract. Here's a look at what an abstract is and how to write one.
An abstract is a concise summary of an experiment or research project. It should be brief, typically under 200 words. The purpose of the abstract is to summarize the research paper by stating the purpose of the research, the experimental method, the findings, and the conclusions.
How to Write an Abstract
The format you'll use for the abstract depends on its purpose. If you're writing for a specific publication or a class assignment, you'll probably need to follow specific guidelines. If there isn't a required format, you'll need to choose from one of two possible types of abstracts.

Informational Abstracts
An informational abstract is a type of abstract used to communicate the contents of an experiment or lab report.
- An informational abstract is like a mini-paper. Its length ranges from a paragraph to one or two pages, depending on the scope of the report. Aim for less than 10% of the length of the full report.
- Summarize all aspects of the report, including purpose, method, results, conclusions, and recommendations. There are no graphs, charts, tables, or images in an abstract. Similarly, an abstract does not include a bibliography or references.
- Highlight important discoveries or anomalies. It's okay if the experiment did not go as planned; you still need to state the outcome in the abstract.
Here is a good format to follow, in order, when writing an informational abstract. Each section is a sentence or two long (a short illustrative sketch follows the list):
- Motivation or Purpose: State why the subject is important or why anyone should care about the experiment and its results.
- Problem: State the hypothesis of the experiment or describe the problem you are trying to solve.
- Method: How did you test the hypothesis or try to solve the problem?
- Results: What was the outcome of the study? Did you support or reject a hypothesis? Did you solve a problem? How close were the results to what you expected? State specific numbers.
- Conclusions: What is the significance of your findings? Do the results lead to an increase in knowledge, a solution that may be applied to other problems, etc.?
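To make the format concrete, here is a small, purely illustrative Python sketch (not part of the original guidance) that assembles an abstract from one or two sentences per section and warns when the draft exceeds the roughly 200-word guideline mentioned above. The section names and example sentences are invented placeholders, not text from any real study.

```python
# Illustrative only: assemble an informational abstract from the five parts
# described above and check it against the ~200-word guideline.

SECTIONS = ["motivation", "problem", "method", "results", "conclusions"]

draft = {
    "motivation": "Caffeine is widely used to offset fatigue in shift workers.",
    "problem": "We hypothesized that moderate coffee intake improves task accuracy.",
    "method": "Forty volunteers completed attention tests with and without 200 mg of caffeine.",
    "results": "Mean accuracy rose from 81% to 88% after caffeine.",
    "conclusions": "Moderate caffeine intake measurably improves short-term task accuracy.",
}

def build_abstract(parts: dict, word_limit: int = 200) -> str:
    """Join the five sections in order and warn if the word limit is exceeded."""
    missing = [s for s in SECTIONS if s not in parts]
    if missing:
        raise ValueError(f"Missing sections: {missing}")
    abstract = " ".join(parts[s] for s in SECTIONS)
    n_words = len(abstract.split())
    if n_words > word_limit:
        print(f"Warning: {n_words} words exceeds the {word_limit}-word guideline.")
    return abstract

print(build_abstract(draft))
```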
Need examples? The abstracts at PubMed.gov (National Institutes of Health database) are informational abstracts. A random example is this abstract on the effect of coffee consumption on Acute Coronary Syndrome.
Descriptive Abstracts
A descriptive abstract is an extremely brief description of the contents of a report. Its purpose is to tell the reader what to expect from the full paper.
- A descriptive abstract is very short, typically less than 100 words.
- It tells the reader what the report contains but doesn't go into detail.
- It briefly summarizes the purpose and experimental method, but not the results or conclusions. Basically, say why and how the study was made, but don't go into findings.
Tips for Writing a Good Abstract
- Write the paper before writing the abstract. You might be tempted to start with the abstract since it comes between the title page and the paper, but it's much easier to summarize a paper or report after it has been completed.
- Write in the third person. Replace phrases like "I found" or "we examined" with phrases like "it was determined" or "this paper provides" or "the investigators found".
- Write the abstract and then pare it down to meet the word limit. In some cases, an overlong abstract will lead to automatic rejection for publication or a lower grade!
- Think of keywords and phrases a person looking for your work might use or enter into a search engine, and include those words in your abstract. Even if the paper won't be published, this is a good habit to develop. (A short sketch after these tips shows a simple way to check word count and keywords.)
- All information in the abstract must be covered in the body of the paper. Don't put a fact in the abstract that isn't described in the report.
- Proofread the abstract for typos, spelling mistakes, and punctuation errors.
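The following sketch is illustrative only; the word limit, sample text, and keyword list are assumptions you would replace with your own abstract and your target journal's requirements. It shows one simple way to act on the tips above: checking an abstract's word count against a limit and confirming that your chosen search keywords actually appear.

```python
# Illustrative helper: check an abstract against a word limit and a keyword list.
# Replace ABSTRACT, WORD_LIMIT, and KEYWORDS with your own text and requirements.

ABSTRACT = (
    "This study investigates the relationship between coffee consumption "
    "and productivity in 25 office workers over four weeks."
)
WORD_LIMIT = 200
KEYWORDS = ["coffee consumption", "productivity"]

word_count = len(ABSTRACT.split())
print(f"Word count: {word_count} / {WORD_LIMIT}")
if word_count > WORD_LIMIT:
    print("Abstract is over the limit; pare it down before submission.")

for kw in KEYWORDS:
    status = "found" if kw.lower() in ABSTRACT.lower() else "MISSING"
    print(f"Keyword '{kw}': {status}")
```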
How to Write an Abstract | Steps & Examples
Published on February 28, 2019 by Shona McCombes . Revised on July 18, 2023 by Eoghan Ryan.

An abstract is a short summary of a longer work (such as a thesis, dissertation or research paper). The abstract concisely reports the aims and outcomes of your research, so that readers know exactly what your paper is about.
Although the structure may vary slightly depending on your discipline, your abstract should describe the purpose of your work, the methods you’ve used, and the conclusions you’ve drawn.
One common way to structure your abstract is to use the IMRaD structure. This stands for:
- Introduction
- Methods
- Results
- Discussion
Abstracts are usually around 100–300 words, but there’s often a strict word limit, so make sure to check the relevant requirements.
In a dissertation or thesis, include the abstract on a separate page, after the title page and acknowledgements but before the table of contents.
Abstract example
The example below shows how the different parts of an abstract fit together.
This paper examines the role of silent movies as a mode of shared experience in the US during the early twentieth century. At this time, high immigration rates resulted in a significant percentage of non-English-speaking citizens. These immigrants faced numerous economic and social obstacles, including exclusion from public entertainment and modes of discourse (newspapers, theater, radio).
Incorporating evidence from reviews, personal correspondence, and diaries, this study demonstrates that silent films were an affordable and inclusive source of entertainment. It argues for the accessible economic and representational nature of early cinema. These concerns are particularly evident in the low price of admission and in the democratic nature of the actors’ exaggerated gestures, which allowed the plots and action to be easily grasped by a diverse audience despite language barriers.
Keywords: silent movies, immigration, public discourse, entertainment, early cinema, language barriers.
When to write an abstract
You will almost always have to include an abstract when:
- Completing a thesis or dissertation
- Submitting a research paper to an academic journal
- Writing a book or research proposal
- Applying for research grants
It’s easiest to write your abstract last, right before the proofreading stage, because it’s a summary of the work you’ve already done. Your abstract should:
- Be a self-contained text, not an excerpt from your paper
- Be fully understandable on its own
- Reflect the structure of your larger work
Step 1: Introduction
Start by clearly defining the purpose of your research. What practical or theoretical problem does the research respond to, or what research question did you aim to answer?
You can include some brief context on the social or academic relevance of your dissertation topic, but don't go into detailed background information. If your abstract uses specialized terms that would be unfamiliar to the average academic reader or that have various different meanings, give a concise definition.
After identifying the problem, state the objective of your research. Use verbs like “investigate,” “test,” “analyze,” or “evaluate” to describe exactly what you set out to do.
This part of the abstract can be written in the present or past simple tense but should never refer to the future, as the research is already complete.
- Incorrect (future tense): This study will investigate the relationship between coffee consumption and productivity.
- Correct: This study investigates the relationship between coffee consumption and productivity.
Step 2: Methods
Next, indicate the research methods that you used to answer your question. This part should be a straightforward description of what you did in one or two sentences. It is usually written in the past simple tense, as it refers to completed actions.
- Incorrect (future tense): Structured interviews will be conducted with 25 participants.
- Correct (past simple): Structured interviews were conducted with 25 participants.
Don’t evaluate validity or obstacles here — the goal is not to give an account of the methodology’s strengths and weaknesses, but to give the reader a quick insight into the overall approach and procedures you used.
Step 3: Results
Next, summarize the main research results. This part of the abstract can be in the present or past simple tense.
- Incorrect (present perfect): Our analysis has shown a strong correlation between coffee consumption and productivity.
- Correct (present simple): Our analysis shows a strong correlation between coffee consumption and productivity.
- Correct (past simple): Our analysis showed a strong correlation between coffee consumption and productivity.
Depending on how long and complex your research is, you may not be able to include all results here. Try to highlight only the most important findings that will allow the reader to understand your conclusions.
Step 4: Discussion
Finally, you should discuss the main conclusions of your research: what is your answer to the problem or question? The reader should finish with a clear understanding of the central point that your research has proved or argued. Conclusions are usually written in the present simple tense.
- Incorrect (past tense): We concluded that coffee consumption increases productivity.
- Correct (present simple): We conclude that coffee consumption increases productivity.
If there are important limitations to your research (for example, related to your sample size or methods), you should mention them briefly in the abstract. This allows the reader to accurately assess the credibility and generalizability of your research.
If your aim was to solve a practical problem, your discussion might include recommendations for implementation. If relevant, you can briefly make suggestions for further research.
If your paper will be published, you might have to add a list of keywords at the end of the abstract. These keywords should reference the most important elements of the research to help potential readers find your paper during their own literature searches.
Be aware that some publication manuals, such as APA Style, have specific formatting requirements for these keywords.
Tips for writing an abstract
It can be a real challenge to condense your whole work into just a couple of hundred words, but the abstract will be the first (and sometimes only) part that people read, so it's important to get it right. These strategies can help you get started.
Read other abstracts
The best way to learn the conventions of writing an abstract in your discipline is to read other people's. You probably already read lots of journal article abstracts while conducting your literature review; try using them as a framework for structure and style.
You can also find lots of dissertation abstract examples in thesis and dissertation databases.
Reverse outline
Not all abstracts will contain precisely the same elements. For longer works, you can write your abstract through a process of reverse outlining.
For each chapter or section, list keywords and draft one to two sentences that summarize the central point or argument. This will give you a framework of your abstract’s structure. Next, revise the sentences to make connections and show how the argument develops.
Write clearly and concisely
A good abstract is short but impactful, so make sure every word counts. Each sentence should clearly communicate one main point.
To keep your abstract or summary short and clear:
- Avoid passive sentences: Passive constructions are often unnecessarily long. You can easily make them shorter and clearer by using the active voice.
- Avoid long sentences: Replace longer expressions with concise expressions or single words (e.g., use "to" instead of "in order to").
- Avoid obscure jargon: The abstract should be understandable to readers who are not familiar with your topic.
- Avoid repetition and filler words: Replace nouns with pronouns when possible and eliminate unnecessary words.
- Avoid detailed descriptions: An abstract is not expected to provide detailed definitions, background information, or discussions of other scholars’ work. Instead, include this information in the body of your thesis or paper.
Check your formatting
If you are writing a thesis or dissertation or submitting to a journal, there are often specific formatting requirements for the abstract, so make sure to check the guidelines and format your work correctly. For APA research papers you can follow the APA abstract format.
Checklist: Abstract
- The word count is within the required length, or a maximum of one page.
- The abstract appears after the title page and acknowledgements and before the table of contents.
- I have clearly stated my research problem and objectives.
- I have briefly described my methodology.
- I have summarized the most important results.
- I have stated my main conclusions.
- I have mentioned any important limitations and recommendations.
- The abstract can be understood by someone without prior knowledge of the topic.
You've written a great abstract! Use the other checklists to continue improving your thesis or dissertation.
Frequently asked questions about abstracts
An abstract is a concise summary of an academic text (such as a journal article or dissertation). It serves two main purposes:
- To help potential readers determine the relevance of your paper for their own research.
- To communicate your key findings to those who don’t have time to read the whole paper.
Abstracts are often indexed along with keywords on academic databases, so they make your work more easily findable. Since the abstract is the first thing any reader sees, it’s important that it clearly and accurately summarizes the contents of your paper.
An abstract for a thesis or dissertation is usually around 200–300 words. There’s often a strict word limit, so make sure to check your university’s requirements.
The abstract is the very last thing you write. You should only write it after your research is complete, so that you can accurately summarize the entirety of your thesis, dissertation or research paper.
Avoid citing sources in your abstract. There are two reasons for this:
- The abstract should focus on your original research, not on the work of others.
- The abstract should be self-contained and fully understandable without reference to other sources.
There are some circumstances where you might need to mention other sources in an abstract: for example, if your research responds directly to another study or focuses on the work of a single theorist. In general, though, don’t include citations unless absolutely necessary.
The abstract appears on its own page in the thesis or dissertation, after the title page and acknowledgements but before the table of contents.

How to Write a Scientific Abstract
Suhasini Nagda, Nair Hospital Dental College, Mumbai, India (J Indian Prosthodont Soc, 13(3), September 2013)
Scientific publications are an important source of information and knowledge in academics, research, and development. When an article is submitted for publication, the abstract is the first part readers encounter and the part that makes the first impression. It is a concise summary of the paper and must convey the right message: a quick overview of the entire work that gives the reader a gist of the paper and an insight into whether it fulfills their expectations.
Abstracts are significant parts of academic assignments and research papers. The abstract is written at the end, when the author has a clear picture of the findings and conclusions and can therefore put the right message forward.
Types of Scientific Abstracts [ 1 ]
- Descriptive
- Informative
- Structured
- Semi-structured
- Non-structured
Descriptive Abstracts
This type of abstract is usually very short (50–100 words). Most descriptive abstracts have certain key parts in common. They are:
□ Background
□ Purpose
□ Particular interest/focus of paper
□ Overview of contents (not always included)
These abstracts are inconvenient in that, because they do not include a detailed presentation of the results, access to the complete article is necessary; they may present the results in a single synthesizing phrase, without numerical or statistical data. Ultimately, these abstracts guide readers on the nature of the contents of the article, but the whole manuscript must be read to learn further details [ 1 ].
Informative Abstracts
From an informative abstract, the reader should be able to get the essence of what your report is about, usually in about 200 words. Most informative abstracts also have key parts in common. Each of these parts might consist of 1–2 sentences. The parts include:
□ Aim or purpose of research
□ Method used
□ Findings/results
□ Conclusion
These abstracts provide accurate data on the contents of the work, especially on the results section. Informative abstracts are short scientific productions: since they follow the IMRaD structure [ 2 ], they can in fact replace the whole text, because readers can extract the most valuable information from them, and in many instances it is not necessary to read the complete text.
Recommendations by the CONSORT [ 3 ] declaration, in its adaptation for abstracts, offer a guide for writing the abstract of a clinical trial in a structured and informative manner, using up to 400 words and briefly including the Title, Methods (participants, interventions, objective, outcomes, randomization, blind tests), Results (number of randomizations, recruitment, number of analyses, outcome, important adverse effects), and Conclusions, registry of the clinical trial, and conflict of interests.
Structured Abstracts
A structured abstract has a paragraph for each section: Introduction, Materials and Methods, Results, and Conclusion (it may even include paragraphs for the objectives or other sections). This type of presentation is often required for informative abstracts. The CONSORT [ 3 ] declaration suggests the presentation of clinical trials with structured abstracts. Structuring an abstract permits its informative development.
Semi-structured Abstract
A semi-structured abstract is written in only one paragraph, where each sentence corresponds to a section. All the sections of the article are present as in the structured abstract [ 1 ].
Non-structured Abstract
When the abstract does not present divisions between each section, and it may not even present any of them, it is a non-structured abstract. The sentences are included in a sole paragraph. This type of presentation is ideal for descriptive abstracts [ 1 ].
Key Steps to Plan Writing an Abstract [ 4 ]
- Introduction—what is the topic?
- Statement of purpose?
- Why have other studies not tackled similar research questions?
- How has the research question been tackled?
- How was the research done?
- What is the key impact of the research?
Errors in the Creation of an Abstract [ 1 ]
- The abstract of an article should contribute to readers the most relevant aspects of each part of the whole manuscript, maintaining a balance between excessive detail and a vague contribution of information.
- The abstract should be written by adequately selecting the words and sentences to accomplish coherent, clear, and concise contents.
- A common defect is including inappropriate information, such as abbreviations, excessive acronyms, bibliographic references, or figures.
- The length of an abstract will be determined by the instructions to authors by each journal; an excessively lengthy abstract is the most frequent error.
- Sections should maintain coherence and order, and the conclusions must be substantiated by the results presented and must respond to the objectives proposed.
- Frequently, abstracts have poorly defined objectives, excessive numerical data and statistical results, and conclusions not based on results presented.
In short, a good abstract is one that:
- Is coherent and concise
- Covers all the essential academic elements of the full-length paper
- Contains no information not included in the paper;
- Is written in plain English and is understandable to a wider audience as well as to a discipline-specific audience;
- Uses passive structures to report findings;
- Uses the language of the original paper, in a more simplified form
- Usually does not include any referencing; and
- In publications such as journals, it is found at the beginning of the text, but in academic assignments, it is placed on a separate preliminary page.
A good abstract usually ensures a good article, but a bad abstract often points towards an undesirable article. Scientific abstracts are a challenge to write and for the success of our publications, careful and planned writing of the abstract is absolutely essential.

Writing a Scientific Paper
What is an abstract?
There are as many kinds of abstracts as there are types of research papers. The classic abstract is usually an "informative" abstract. This kind of abstract communicates compressed information and includes the purpose, methods, and scope of the article. They are usually short (250 words or less) and allow the reader to decide whether they want to read the article.
The goal is to communicate:
- What was done?
- Why was it done?
- How was it done?
- What was found?
- What is the significance of the findings?
What is a "good" abstract?
- Self-contained: uses one or more well-developed paragraphs
- Uses introduction/body/conclusion structure
- Presents purpose, results, conclusions and recommendations in that order
- Adds no new information
- Is understandable to a wide audience
Techniques to write an abstract
- Write the abstract last
- Reread the article looking specifically for the main parts: Purpose, methods, scope, results, conclusions, and recommendations
- Write a first rough draft without looking at the original article
- Edit your draft by correcting organization, improving transitions, dropping unnecessary information and words, and adding important information you left out
"Abstract checklist" (from How to Write a Good Scientific Paper, Chris A. Mack, SPIE, 2018)
The abstract should be a concise (200 words or less), standalone summary of the paper, with 1–2 sentences on each of these topics:
- Background: What issues led to this work? What is the environment that makes this work interesting or important?
- Aim: What were the goals of this work? What gap is being filled?
- Approach: What went into trying to achieve the aims (e.g., experimental method, simulation approach, theoretical approach, combinations of these, etc.)? What was actually done?
- Results: What were the main results of the study (including numbers, if appropriate)?
- Conclusions: What were the main conclusions? Why are the results important? Where will they lead?
The abstract should be written for the audience of this journal: do not assume too much or too little background with the topic.
Ensure that all of the information found in the abstract also can be found in the body of the paper.
Ensure that the important information of the paper is found in the abstract.
Avoid: using the first paragraph of the introduction as an abstract; citations in the abstract; acronyms (but if used, spell them out); referring to figures or tables from the body of the paper; use of the first person; use of words like “new” or “novel,” or phrases like “in this paper,” “we report,” or “will be discussed.”
Abstracts in Scientific Research Papers (IMRaD)
An effective abstract in an IMRaD* report provides the reader with a concise, informative summary of the entire paper . An IMRaD abstract should stand on its own; it is not a part of the introduction. The abstract should clearly preview the paper’s content, allowing the reader to decide if the information is relevant to them and whether they should read the whole report. Abstracts should contain keywords and phrases that allow for easy searching online.
An IMRaD abstract is typically a single paragraph of 150-300 words. However, abstract conventions can vary by discipline or publication venue (e.g., journal). Because the IMRaD abstract is a concise summary of the whole paper, writers draft their abstracts after they have written a full draft of their IMRaD report.
* IMRaD refers to reports with the structure Introduction-Method-Results-Discussion used in empirical research in natural and social sciences. Please refer to the Writing Center quick guide “Writing an IMRaD Report” for more explanations.
Common Moves in Abstracts
An abstract contains elements of all sections of the IMRaD report: Introduction, Method, Results, and Discussion. The moves typically made in an abstract correspond to those sections: establishing the context and stating the purpose of the study (Introduction), describing the methodology (Method), presenting the main findings (Results), and discussing the findings (Discussion).
This breakdown is adapted from Doro, K. (2013). The rhetoric structure of research article abstracts in English studies journals. Prague Journal of English Studies, 2(1), 125-26. https://core.ac.uk/download/pdf/80769215.pdf and Samraj, B. (2005). An exploration of a genre set: Research article abstracts and introduction in two disciplines. English for Specific Purposes, 24, 141-56.
Sample Abstracts
Simple text = Establishing the context
Italics = Stating the purpose/introducing the study
Underlined = Describing methodology
Bold = Presenting the results
Bolded Italics = Discussing the findings
Teachers’ social support and classroom management are related to secondary students’ achievement, domain-specific interest, and self-concept. However, little is known about whether social support and classroom management shape secondary students’ general school adjustment beyond these domain-specific outcomes. To investigate this question, we drew on data from a large longitudinal research project (N = 5,607 secondary students, N = 227 classes). We applied student and teacher ratings of social support and classroom management to investigate their perspective-specific validities for predicting student outcomes. To measure students’ school adjustment, we assessed achievement as a domain-specific indicator and school satisfaction, truancy, and self-esteem as more general aspects . Multilevel confirmatory factor analyses showed that both teachers and students distinguished between social support and classroom management. Teacher and student ratings of classroom management largely converged, whereas their perceptions of social support were not statistically significantly associated with one another. In multilevel structural equation modeling, both perspectives uniquely predicted students’ school adjustment: Student-rated social support was linked to all outcomes at the student level and to school satisfaction and self-esteem at the class level. Classroom management showed only weak associations with outcomes at the student level, but at the class level, student-rated classroom management was related to truancy and teacher-rated classroom management was linked to school satisfaction and student achievement. These findings highlight the important role of teachers in students’ general school adjustment and show the benefit of considering different perspectives and levels of analyses. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
From Aldrup, K., Klusmann, U., Lüdtke, O., Göllner, R., & Trautwein, U. (2018). Social support and classroom management are related to secondary students’ general school adjustment: A multilevel structural equation model using student and teacher ratings. Journal of Educational Psychology, 110 (8), 1066–1083.
We present an algorithm for simultaneous face detection, landmarks localization, pose estimation and gender recognition using deep convolutional neural networks (CNN). The proposed method called, HyperFace, fuses the intermediate layers of a deep CNN using a separate CNN followed by a multi-task learning algorithm that operates on the fused features. It exploits the synergy among the tasks which boosts up their individual performances. Additionally, we propose two variants of HyperFace: (1) HyperFace-ResNet that builds on the ResNet-101 model and achieves significant improvement in performance, and (2) Fast-HyperFace that uses a high recall fast face detector for generating region proposals to improve the speed of the algorithm. Extensive experiments show that the proposed models are able to capture both global and local information in faces and performs significantly better than many competitive algorithms for each of these four tasks.
From Ranjan, R., Patel, V. M., and Chellappa, R. (2019). Hyperface: a deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence , 41(1), 121-135.
As can be seen from the two examples above, different disciplines and journals use introduction moves differently. Therefore, before writing an abstract for your study, you should read published IMRaD abstracts from your field to familiarize yourself with the conventions and expectations of your discipline.
Common Problems to Avoid in IMRaD Abstracts
- The abstract provides a statement of what the paper will ask or explore rather than what it found:
X This report examines the causes of oversleeping. (What did it find out about these causes?)
√ Individuals oversleep because they go to bed too late, forget to set their alarms, and keep their rooms dark.
- The abstract provides general categories rather than specific details in the findings:
X The study draws conclusions about which variables are most important in choosing a movie theater. (What, specifically, are these variables?)
√ The study concludes that the most important variables in choosing a movie theater are comfortable seats and high-quality popcorn.
Activity to Help You Prepare for Writing an IMRaD Abstract
To prepare for writing IMRaD abstracts, find several IMRaD articles from journals in your discipline, and use the following questions to analyze the abstracts:
- How many paragraphs do the abstracts consist of? How many words do they contain?
- Which moves are present in these abstracts? Which are absent?
- Do the authors include citations in the abstracts? If they do, in which moves and for which purposes?
- Which verb tenses are used in each move?
- Do the authors include numbers and statistics? If they do, in which moves?
- How many keywords are included at the end of the abstract? How do you think the authors decided which keywords to include?
Organizing Your Social Sciences Research Paper: The Abstract
An abstract summarizes, usually in one paragraph of 300 words or less, the major aspects of the entire paper in a prescribed sequence that includes: 1) the overall purpose of the study and the research problem(s) you investigated; 2) the basic design of the study; 3) major findings or trends found as a result of your analysis; and, 4) a brief summary of your interpretations and conclusions.
Writing an Abstract. The Writing Center. Clarion University, 2009; Writing an Abstract for Your Research Paper. The Writing Center, University of Wisconsin, Madison.
Importance of a Good Abstract
Sometimes your professor will ask you to include an abstract, or general summary of your work, with your research paper. The abstract allows you to elaborate upon each major aspect of the paper and helps readers decide whether they want to read the rest of the paper. Therefore, enough key information [e.g., summary results, observations, trends, etc.] must be included to make the abstract useful to someone who may want to examine your work.
How do you know when you have enough information in your abstract? A simple rule-of-thumb is to imagine that you are another researcher doing a similar study. Then ask yourself: if your abstract was the only part of the paper you could access, would you be happy with the amount of information presented there? Does it tell the whole story about your study? If the answer is "no" then the abstract likely needs to be revised.
How to Write a Research Abstract. Office of Undergraduate Research. University of Kentucky; Staiger, David L. "What Today's Students Need to Know about Writing Abstracts." International Journal of Business Communication 3 (January 1966): 29-33; Swales, John M. and Christine B. Feak. Abstracts and the Writing of Abstracts. Ann Arbor, MI: University of Michigan Press, 2009.
Structure and Writing Style
I. Types of Abstracts
To begin, you need to determine which type of abstract you should include with your paper. There are four general types.
Critical Abstract
A critical abstract provides, in addition to describing main findings and information, a judgment or comment about the study's validity, reliability, or completeness. The researcher evaluates the paper and often compares it with other works on the same subject. Critical abstracts are generally 400-500 words in length due to the additional interpretive commentary. These types of abstracts are used infrequently.
Descriptive Abstract
A descriptive abstract indicates the type of information found in the work. It makes no judgments about the work, nor does it provide results or conclusions of the research. It does incorporate key words found in the text and may include the purpose, methods, and scope of the research. Essentially, the descriptive abstract only describes the work being summarized. Some researchers consider it an outline of the work, rather than a summary. Descriptive abstracts are usually very short, 100 words or less.
Informative Abstract
The majority of abstracts are informative. While they still do not critique or evaluate a work, they do more than describe it. A good informative abstract acts as a surrogate for the work itself. That is, the researcher presents and explains all the main arguments and the important results and evidence in the paper. An informative abstract includes the information that can be found in a descriptive abstract [purpose, methods, scope] but it also includes the results and conclusions of the research and the recommendations of the author. The length varies according to discipline, but an informative abstract is usually no more than 300 words in length.
Highlight Abstract
A highlight abstract is specifically written to attract the reader's attention to the study. No pretense is made of there being either a balanced or complete picture of the paper and, in fact, incomplete and leading remarks may be used to spark the reader's interest. In that a highlight abstract cannot stand independent of its associated article, it is not a true abstract and, therefore, rarely used in academic writing.
II. Writing Style
Use the active voice when possible , but note that much of your abstract may require passive sentence constructions. Regardless, write your abstract using concise, but complete, sentences. Get to the point quickly and always use the past tense because you are reporting on a study that has been completed.
Abstracts should be formatted as a single paragraph in a block format and with no paragraph indentations. In most cases, the abstract page immediately follows the title page. Do not number the page. Rules set forth in writing manuals vary but, in general, you should center the word "Abstract" at the top of the page with double spacing between the heading and the abstract. The final sentences of an abstract concisely summarize your study's conclusions, implications, or applications to practice and, if appropriate, can be followed by a statement about the need for additional research revealed from the findings.
Composing Your Abstract
Although it is the first section of your paper, the abstract should be written last since it will summarize the contents of your entire paper. A good strategy to begin composing your abstract is to take whole sentences or key phrases from each section of the paper and put them in a sequence that summarizes the contents. Then revise or add connecting phrases or words to make the narrative flow clearly and smoothly. Note that statistical findings should be reported parenthetically [i.e., written in parentheses].
Before handing in your final paper, check to make sure that the information in the abstract completely agrees with what you have written in the paper. Think of the abstract as a sequential set of complete sentences describing the most crucial information using the fewest necessary words. The abstract SHOULD NOT contain (a rough pattern-check sketch follows this list):
- A catchy introductory phrase, provocative quote, or other device to grab the reader's attention,
- Lengthy background or contextual information,
- Redundant phrases, unnecessary adverbs and adjectives, and repetitive information;
- Acronyms or abbreviations,
- References to other literature [say something like, "current research shows that..." or "studies have indicated..."],
- Using ellipticals [i.e., ending with "..."] or incomplete sentences,
- Jargon or terms that may be confusing to the reader,
- Citations to other works, and
- Any sort of image, illustration, figure, or table, or references to them.
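Purely as an illustration of how a few of these rules could be checked mechanically (this is not part of the guide above, and the patterns are rough heuristics rather than an authoritative checker), the sketch below uses regular expressions to flag bracketed or parenthetical citations, ellipses, references to figures or tables, and the first-person phrasing discouraged elsewhere in this document.

```python
import re

# Rough heuristic checks for a few of the "should not contain" items above.
# These patterns are illustrative and will not catch every case.
CHECKS = {
    "bracketed or parenthetical citation": r"\[\d+\]|\(\w+,?\s*\d{4}\)",
    "ellipsis / incomplete sentence": r"\.\.\.",
    "reference to a figure or table": r"\b(Figure|Fig\.|Table)\s*\d+",
    "first-person phrasing": r"\b(I|we|our)\b",
}

def flag_problems(abstract: str) -> list[str]:
    """Return the names of rules whose pattern matches the abstract."""
    return [name for name, pattern in CHECKS.items()
            if re.search(pattern, abstract, flags=re.IGNORECASE)]

sample = "We found that solubility increases with temperature (see Figure 2) [3]."
print(flag_problems(sample))
# -> ['bracketed or parenthetical citation', 'reference to a figure or table', 'first-person phrasing']
```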
Abstract. Writing Center. University of Kansas; Abstract. The Structure, Format, Content, and Style of a Journal-Style Scientific Paper. Department of Biology. Bates College; Abstracts. The Writing Center. University of North Carolina; Borko, Harold and Seymour Chatman. "Criteria for Acceptable Abstracts: A Survey of Abstracters' Instructions." American Documentation 14 (April 1963): 149-160; Abstracts. The Writer's Handbook. Writing Center. University of Wisconsin, Madison; Hartley, James and Lucy Betts. "Common Weaknesses in Traditional Abstracts in the Social Sciences." Journal of the American Society for Information Science and Technology 60 (October 2009): 2010-2018; Procter, Margaret. The Abstract. University College Writing Centre. University of Toronto; Riordan, Laura. "Mastering the Art of Abstracts." The Journal of the American Osteopathic Association 115 (January 2015): 41-47; Writing Report Abstracts. The Writing Lab and The OWL. Purdue University; Writing Abstracts. Writing Tutorial Services, Center for Innovative Teaching and Learning. Indiana University; Koltay, Tibor. Abstracts and Abstracting: A Genre and Set of Skills for the Twenty-First Century. Oxford, UK: 2010; Writing an Abstract for Your Research Paper. The Writing Center, University of Wisconsin, Madison.
Writing Tip
Never Cite Just the Abstract!
Citing to just a journal article's abstract does not confirm for the reader that you have conducted a thorough or reliable review of the literature. If the full-text is not available, go to the USC Libraries main page and enter the title of the article [NOT the title of the journal]. If the Libraries have a subscription to the journal, the article should appear with a link to the full-text or to the journal publisher page where you can get the article. If the article does not appear, try searching Google Scholar using the link on the USC Libraries main page. If you still can't find the article after doing this, contact a librarian or you can request it from our free interlibrary loan and document delivery service.

Scientific Reports
What this handout is about.
This handout provides a general guide to writing reports about scientific research you’ve performed. In addition to describing the conventional rules about the format and content of a lab report, we’ll also attempt to convey why these rules exist, so you’ll get a clearer, more dependable idea of how to approach this writing situation. Readers of this handout may also find our handout on writing in the sciences useful.
Background and pre-writing
Why do we write research reports?
You did an experiment or study for your science class, and now you have to write it up for your teacher to review. You feel that you understood the background sufficiently, designed and completed the study effectively, obtained useful data, and can use those data to draw conclusions about a scientific process or principle. But how exactly do you write all that? What is your teacher expecting to see?
To take some of the guesswork out of answering these questions, try to think beyond the classroom setting. In fact, you and your teacher are both part of a scientific community, and the people who participate in this community tend to share the same values. As long as you understand and respect these values, your writing will likely meet the expectations of your audience—including your teacher.
So why are you writing this research report? The practical answer is “Because the teacher assigned it,” but that’s classroom thinking. Generally speaking, people investigating some scientific hypothesis have a responsibility to the rest of the scientific world to report their findings, particularly if these findings add to or contradict previous ideas. The people reading such reports have two primary goals:
- They want to gather the information presented.
- They want to know that the findings are legitimate.
Your job as a writer, then, is to fulfill these two goals.
How do I do that?
Good question. Here is the basic format scientists have designed for research reports:
- Introduction
- Methods and Materials
- Results
- Discussion
This format, sometimes called “IMRAD,” may take slightly different shapes depending on the discipline or audience; some ask you to include an abstract or separate section for the hypothesis, or call the Discussion section “Conclusions,” or change the order of the sections (some professional and academic journals require the Methods section to appear last). Overall, however, the IMRAD format was devised to represent a textual version of the scientific method.
The scientific method, you'll probably recall, involves developing a hypothesis, testing it, and deciding whether your findings support the hypothesis. In essence, the format for a research report in the sciences mirrors the scientific method but fleshes out the process a little: the Introduction presents the question and the hypothesis, the Methods section describes how the hypothesis was tested, the Results section reports what was observed, and the Discussion considers whether the findings support the hypothesis and what they mean.
Thinking of your research report as based on the scientific method, but elaborated in the ways described above, may help you to meet your audience’s expectations successfully. We’re going to proceed by explicitly connecting each section of the lab report to the scientific method, then explaining why and how you need to elaborate that section.
Although this handout takes each section in the order in which it should be presented in the final report, you may for practical reasons decide to compose sections in another order. For example, many writers find that composing their Methods and Results before the other sections helps to clarify their idea of the experiment or study as a whole. You might consider using each assignment to practice different approaches to drafting the report, to find the order that works best for you.
What should I do before drafting the lab report?
The best way to prepare to write the lab report is to make sure that you fully understand everything you need to about the experiment. Obviously, if you don’t quite know what went on during the lab, you’re going to find it difficult to explain the lab satisfactorily to someone else. To make sure you know enough to write the report, complete the following steps:
- What are we going to do in this lab? (That is, what’s the procedure?)
- Why are we going to do it that way?
- What are we hoping to learn from this experiment?
- Why would we benefit from this knowledge?
- Consult your lab supervisor as you perform the lab. If you don’t know how to answer one of the questions above, for example, your lab supervisor will probably be able to explain it to you (or, at least, help you figure it out).
- Plan the steps of the experiment carefully with your lab partners. The less you rush, the more likely it is that you’ll perform the experiment correctly and record your findings accurately. Also, take some time to think about the best way to organize the data before you have to start putting numbers down. If you can design a table to account for the data, that will tend to work much better than jotting results down hurriedly on a scrap piece of paper.
- Record the data carefully so you get them right. You won’t be able to trust your conclusions if you have the wrong data, and your readers will know you messed up if the other three people in your group have “97 degrees” and you have “87.”
- Consult with your lab partners about everything you do. Lab groups often make one of two mistakes: two people do all the work while two have a nice chat, or everybody works together until the group finishes gathering the raw data, then scrams outta there. Collaborate with your partners, even when the experiment is “over.” What trends did you observe? Was the hypothesis supported? Did you all get the same results? What kind of figure should you use to represent your findings? The whole group can work together to answer these questions.
- Consider your audience. You may believe that audience is a non-issue: it’s your lab TA, right? Well, yes—but again, think beyond the classroom. If you write with only your lab instructor in mind, you may omit material that is crucial to a complete understanding of your experiment, because you assume the instructor knows all that stuff already. As a result, you may receive a lower grade, since your TA won’t be sure that you understand all the principles at work. Try to write towards a student in the same course but a different lab section. That student will have a fair degree of scientific expertise but won’t know much about your experiment particularly. Alternatively, you could envision yourself five years from now, after the reading and lectures for this course have faded a bit. What would you remember, and what would you need explained more clearly (as a refresher)?
Once you’ve completed these steps as you perform the experiment, you’ll be in a good position to draft an effective lab report.
Introductions
How do I write a strong Introduction?
For the purposes of this handout, we’ll consider the Introduction to contain four basic elements: the purpose, the scientific literature relevant to the subject, the hypothesis, and the reasons you believed your hypothesis viable. Let’s start by going through each element of the Introduction to clarify what it covers and why it’s important. Then we can formulate a logical organizational strategy for the section.
The inclusion of the purpose (sometimes called the objective) of the experiment often confuses writers. The biggest misconception is that the purpose is the same as the hypothesis. Not quite. We’ll get to hypotheses in a minute, but basically they provide some indication of what you expect the experiment to show. The purpose is broader, and deals more with what you expect to gain through the experiment. In a professional setting, the hypothesis might have something to do with how cells react to a certain kind of genetic manipulation, but the purpose of the experiment is to learn more about potential cancer treatments. Undergraduate reports don’t often have this wide-ranging a goal, but you should still try to maintain the distinction between your hypothesis and your purpose. In a solubility experiment, for example, your hypothesis might talk about the relationship between temperature and the rate of solubility, but the purpose is probably to learn more about some specific scientific principle underlying the process of solubility.
For starters, most people say that you should write out your working hypothesis before you perform the experiment or study. Many beginning science students neglect to do so and find themselves struggling to remember precisely which variables were involved in the process or in what way the researchers felt that they were related. Write your hypothesis down as you develop it—you’ll be glad you did.
As for the form a hypothesis should take, it’s best not to be too fancy or complicated; an inventive style isn’t nearly so important as clarity here. There’s nothing wrong with beginning your hypothesis with the phrase, “It was hypothesized that . . .” Be as specific as you can about the relationship between the different objects of your study. In other words, explain that when term A changes, term B changes in this particular way. Readers of scientific writing are rarely content with the idea that a relationship between two terms exists—they want to know what that relationship entails.
Not a hypothesis:
“It was hypothesized that there is a significant relationship between the temperature of a solvent and the rate at which a solute dissolves.”
Hypothesis:
“It was hypothesized that as the temperature of a solvent increases, the rate at which a solute will dissolve in that solvent increases.”
Put more technically, most hypotheses contain both an independent and a dependent variable. The independent variable is what you manipulate to test the reaction; the dependent variable is what changes as a result of your manipulation. In the example above, the independent variable is the temperature of the solvent, and the dependent variable is the rate of solubility. Be sure that your hypothesis includes both variables.
Justify your hypothesis
You need to do more than tell your readers what your hypothesis is; you also need to assure them that this hypothesis was reasonable, given the circumstances. In other words, use the Introduction to explain that you didn’t just pluck your hypothesis out of thin air. (If you did pluck it out of thin air, your problems with your report will probably extend beyond using the appropriate format.) If you posit that a particular relationship exists between the independent and the dependent variable, what led you to believe your “guess” might be supported by evidence?
Scientists often refer to this type of justification as “motivating” the hypothesis, in the sense that something propelled them to make that prediction. Often, motivation includes what we already know—or rather, what scientists generally accept as true (see “Background/previous research” below). But you can also motivate your hypothesis by relying on logic or on your own observations. If you’re trying to decide which solutes will dissolve more rapidly in a solvent at increased temperatures, you might remember that some solids are meant to dissolve in hot water (e.g., bouillon cubes) and some are used for a function precisely because they withstand higher temperatures (they make saucepans out of something). Or you can think about whether you’ve noticed sugar dissolving more rapidly in your glass of iced tea or in your cup of coffee. Even such basic, outside-the-lab observations can help you justify your hypothesis as reasonable.
Background/previous research
This part of the Introduction demonstrates to the reader your awareness of how you’re building on other scientists’ work. If you think of the scientific community as engaging in a series of conversations about various topics, then you’ll recognize that the relevant background material will alert the reader to which conversation you want to enter.
Generally speaking, authors writing journal articles use the background for slightly different purposes than do students completing assignments. Because readers of academic journals tend to be professionals in the field, authors explain the background in order to permit readers to evaluate the study’s pertinence for their own work. You, on the other hand, write toward a much narrower audience—your peers in the course or your lab instructor—and so you must demonstrate that you understand the context for the (presumably assigned) experiment or study you’ve completed. For example, if your professor has been talking about polarity during lectures, and you’re doing a solubility experiment, you might try to connect the polarity of a solid to its relative solubility in certain solvents. In any event, both professional researchers and undergraduates need to connect the background material overtly to their own work.
Organization of this section
Most of the time, writers begin by stating the purpose or objectives of their own work, which establishes for the reader’s benefit the “nature and scope of the problem investigated” (Day 1994). Once you have expressed your purpose, you should then find it easier to move from the general purpose, to relevant material on the subject, to your hypothesis. In abbreviated form, an Introduction section might look like this:
“The purpose of the experiment was to test conventional ideas about solubility in the laboratory [purpose] . . . According to Whitecoat and Labrat (1999), at higher temperatures the molecules of solvents move more quickly . . . We know from the class lecture that molecules moving at higher rates of speed collide with one another more often and thus break down more easily [background material/motivation] . . . Thus, it was hypothesized that as the temperature of a solvent increases, the rate at which a solute will dissolve in that solvent increases [hypothesis].”
Again—these are guidelines, not commandments. Some writers and readers prefer different structures for the Introduction. The one above merely illustrates a common approach to organizing material.
How do I write a strong Materials and Methods section?
As with any piece of writing, your Methods section will succeed only if it fulfills its readers’ expectations, so you need to be clear in your own mind about the purpose of this section. Let’s review the purpose as we described it above: in this section, you want to describe in detail how you tested the hypothesis you developed and also to clarify the rationale for your procedure. In science, it’s not sufficient merely to design and carry out an experiment. Ultimately, others must be able to verify your findings, so your experiment must be reproducible, to the extent that other researchers can follow the same procedure and obtain the same (or similar) results.
Here’s a real-world example of the importance of reproducibility. In 1989, electrochemists Stanley Pons and Martin Fleischmann announced that they had discovered “cold fusion,” a way of producing excess heat and power without the nuclear radiation that accompanies “hot fusion.” Such a discovery could have great ramifications for the industrial production of energy, so these findings created a great deal of interest. When other scientists tried to duplicate the experiment, however, they didn’t achieve the same results, and as a result many wrote off the conclusions as unjustified (or worse, a hoax). To this day, the viability of cold fusion is debated within the scientific community, though the original claims were never reliably replicated. So when you write your Methods section, keep in mind that you need to describe your experiment well enough to allow others to replicate it exactly.
With these goals in mind, let’s consider how to write an effective Methods section in terms of content, structure, and style.
Sometimes the hardest thing about writing this section isn’t what you should talk about, but what you shouldn’t talk about. Writers often want to include the results of their experiment, because they measured and recorded the results during the course of the experiment. But such data should be reserved for the Results section. In the Methods section, you can write that you recorded the results, or how you recorded the results (e.g., in a table), but you shouldn’t write what the results were—not yet. Here, you’re merely stating exactly how you went about testing your hypothesis. As you draft your Methods section, ask yourself the following questions:
- How much detail? Be precise in providing details, but stay relevant. Ask yourself, “Would it make any difference if this piece were a different size or made from a different material?” If not, you probably don’t need to get too specific. If so, you should give as many details as necessary to prevent this experiment from going awry if someone else tries to carry it out. Probably the most crucial detail is measurement; you should always quantify anything you can, such as time elapsed, temperature, mass, volume, etc.
- Rationale: Be sure that as you’re relating your actions during the experiment, you explain your rationale for the protocol you developed. If you capped a test tube immediately after adding a solute to a solvent, why did you do that? (That’s really two questions: why did you cap it, and why did you cap it immediately?) In a professional setting, writers provide their rationale as a way to explain their thinking to potential critics. On one hand, of course, that’s your motivation for talking about protocol, too. On the other hand, since in practical terms you’re also writing to your teacher (who’s seeking to evaluate how well you comprehend the principles of the experiment), explaining the rationale indicates that you understand the reasons for conducting the experiment in that way, and that you’re not just following orders. Critical thinking is crucial—robots don’t make good scientists.
- Control: Most experiments will include a control, which is a means of comparing experimental results. (Sometimes you’ll need to have more than one control, depending on the number of hypotheses you want to test.) The control is exactly the same as the other items you’re testing, except that you don’t manipulate the independent variable, the condition you’re altering to check the effect on the dependent variable. For example, if you’re testing solubility rates at increased temperatures, your control would be a solution that you didn’t heat at all; that way, you’ll see how quickly the solute dissolves “naturally” (i.e., without manipulation), and you’ll have a point of reference against which to compare the solutions you did heat.
Describe the control in the Methods section. Two things are especially important in writing about the control: identify the control as a control, and explain what you’re controlling for. Here is an example:
“As a control for the temperature change, we placed the same amount of solute in the same amount of solvent, and let the solution stand for five minutes without heating it.”
Structure and style
Organization is especially important in the Methods section of a lab report because readers must understand your experimental procedure completely. Many writers are surprised by the difficulty of conveying what they did during the experiment, since after all they’re only reporting an event, but it’s often tricky to present this information in a coherent way. There’s a fairly standard structure you can use to guide you, and following the conventions for style can help clarify your points.
- Subsections: Occasionally, researchers use subsections to report their procedure when the following circumstances apply: 1) if they’ve used a great many materials; 2) if the procedure is unusually complicated; 3) if they’ve developed a procedure that won’t be familiar to many of their readers. Because these conditions rarely apply to the experiments you’ll perform in class, most undergraduate lab reports won’t require you to use subsections. In fact, many guides to writing lab reports suggest that you try to limit your Methods section to a single paragraph.
- Narrative structure: Think of this section as telling a story about a group of people and the experiment they performed. Describe what you did in the order in which you did it. You may have heard the old joke centered on the line, “Disconnect the red wire, but only after disconnecting the green wire,” where the person reading the directions blows everything to kingdom come because the directions weren’t in order. We’re used to reading about events chronologically, and so your readers will generally understand what you did if you present that information in the same way. Also, since the Methods section does generally appear as a narrative (story), you want to avoid the “recipe” approach: “First, take a clean, dry 100 ml test tube from the rack. Next, add 50 ml of distilled water.” You should be reporting what did happen, not telling the reader how to perform the experiment: “50 ml of distilled water was poured into a clean, dry 100 ml test tube.” Hint: most of the time, the recipe approach comes from copying down the steps of the procedure from your lab manual, so you may want to draft the Methods section initially without consulting your manual. Later, of course, you can go back and fill in any part of the procedure you inadvertently overlooked.
- Past tense: Remember that you’re describing what happened, so you should use past tense to refer to everything you did during the experiment. Writers are often tempted to use the imperative (“Add 5 g of the solid to the solution”) because that’s how their lab manuals are worded; less frequently, they use present tense (“5 g of the solid are added to the solution”). Instead, remember that you’re talking about an event which happened at a particular time in the past, and which has already ended by the time you start writing, so simple past tense will be appropriate in this section (“5 g of the solid were added to the solution” or “We added 5 g of the solid to the solution”).
- Active versus passive voice: Lab reports have traditionally been written in the passive voice, which keeps the emphasis on the materials and the procedure rather than on the researchers. Compare the two versions below:
- Active: We heated the solution to 80°C. (The subject, “we,” performs the action, heating.)
- Passive: The solution was heated to 80°C. (The subject, “solution,” doesn’t do the heating; it is acted upon, not acting.)
Increasingly, especially in the social sciences, using first person and active voice is acceptable in scientific reports. Most readers find that this style of writing conveys information more clearly and concisely. This rhetorical choice thus brings two scientific values into conflict: objectivity versus clarity. Since the scientific community hasn’t reached a consensus about which style it prefers, you may want to ask your lab instructor.
How do I write a strong Results section?
Here’s a paradox for you. The Results section is often both the shortest (yay!) and most important (uh-oh!) part of your report. Your Materials and Methods section shows how you obtained the results, and your Discussion section explores the significance of the results, so clearly the Results section forms the backbone of the lab report. This section provides the most critical information about your experiment: the data that allow you to discuss how your hypothesis was or wasn’t supported. But it doesn’t provide anything else, which explains why this section is generally shorter than the others.
Before you write this section, look at all the data you collected to figure out what relates significantly to your hypothesis. You’ll want to highlight this material in your Results section. Resist the urge to include every bit of data you collected, since perhaps not all are relevant. Also, don’t try to draw conclusions about the results—save them for the Discussion section. In this section, you’re reporting facts. Nothing your readers can dispute should appear in the Results section.
Most Results sections feature three distinct parts: text, tables, and figures. Let’s consider each part one at a time.
Text
This should be a short paragraph, generally just a few lines, that describes the results you obtained from your experiment. In a relatively simple experiment, one that doesn’t produce a lot of data for you to report, the text can represent the entire Results section. Don’t feel that you need to include lots of extraneous detail to compensate for a short (but effective) text; your readers appreciate discrimination more than your ability to recite facts. In a more complex experiment, you may want to use tables and/or figures to help guide your readers toward the most important information you gathered. In that event, you’ll need to refer to each table or figure directly, where appropriate:
“Table 1 lists the rates of solubility for each substance.”
“Solubility increased as the temperature of the solution increased (see Figure 1).”
If you do use tables or figures, make sure that you don’t present the same material in both the text and the tables/figures, since in essence you’ll just repeat yourself, probably annoying your readers with the redundancy of your statements.
Feel free to describe trends that emerge as you examine the data. Although identifying trends requires some judgment on your part and so may not feel like factual reporting, no one can deny that these trends do exist, and so they properly belong in the Results section. Example:
“Heating the solution increased the rate of solubility of polar solids by 45% but had no effect on the rate of solubility in solutions containing non-polar solids.”
This point isn’t debatable—you’re just pointing out what the data show.
As in the Materials and Methods section, you want to refer to your data in the past tense, because the events you recorded have already occurred and have finished occurring. In the example above, note the use of “increased” and “had,” rather than “increases” and “has.” (You don’t know from your experiment that heating always increases the solubility of polar solids, but it did that time.)
Tables
You shouldn’t put information in the table that also appears in the text. You also shouldn’t use a table to present irrelevant data, just to show you did collect these data during the experiment. Tables are good for some purposes and situations, but not others, so whether and how you’ll use tables depends upon what you need them to accomplish.
Tables are useful ways to show variation in data, but not to present a great deal of unchanging measurements. If you’re dealing with a scientific phenomenon that occurs only within a certain range of temperatures, for example, you don’t need to use a table to show that the phenomenon didn’t occur at any of the other temperatures. How useful is this table?
[Table not reproduced: solubility observations across the full range of trial temperatures, most of which showed no solubility.]
As you can probably see, no solubility was observed until the trial temperature reached 50°C, a fact that the text part of the Results section could easily convey. The table could then be limited to what happened at 50°C and higher, thus better illustrating the differences in solubility rates when solubility did occur.
As a rule, try not to use a table to describe any experimental event you can cover in one sentence of text. Here’s an example of an unnecessary table from How to Write and Publish a Scientific Paper, by Robert A. Day:
[Table not reproduced: oxygen requirements of several Streptomyces species, from Day.]
As Day notes, all the information in this table can be summarized in one sentence: “S. griseus, S. coelicolor, S. everycolor, and S. rainbowenski grew under aerobic conditions, whereas S. nocolor and S. greenicus required anaerobic conditions.” Most readers won’t find the table clearer than that one sentence.
When you do have reason to tabulate material, pay attention to the clarity and readability of the format you use. Here are a few tips (a short sketch of a table laid out along these lines follows the list):
- Number your table. Then, when you refer to the table in the text, use that number to tell your readers which table they can review to clarify the material.
- Give your table a title. This title should be descriptive enough to communicate the contents of the table, but not so long that it becomes difficult to follow. The titles in the sample tables above are acceptable.
- Arrange your table so that readers read vertically, not horizontally. For the most part, this rule means that you should construct your table so that like elements read down, not across. Think about what you want your readers to compare, and put that information in the column (up and down) rather than in the row (across). Usually, the point of comparison will be the numerical data you collect, so especially make sure you have columns of numbers, not rows. Here’s an example of how drastically this decision affects the readability of your table (from A Short Guide to Writing about Chemistry, by Herbert Beall and John Trimbur). Look at this table, which presents the relevant data in horizontal rows:
[Table not reproduced: length and height measurements over five trials, arranged in horizontal rows.]
It’s a little tough to see the trends that the author presumably wants to present in this table. Compare this table, in which the data appear vertically:
[Table not reproduced: the same measurements rearranged into vertical columns.]
The second table shows how putting like elements in a vertical column makes for easier reading. In this case, the like elements are the measurements of length and height, over five trials–not, as in the first table, the length and height measurements for each trial.
- Make sure to include units of measurement in the tables. Readers might be able to guess that you measured something in millimeters, but don’t make them try.
- Don’t use vertical lines as part of the format for your table. This convention exists because vertical lines make tables more expensive for journals to reproduce and print. Even though it’s fairly unlikely that you’ll be sending your Biology 11 lab report to Science for publication, your readers still have this expectation. Consequently, if you use the table-drawing option in your word-processing software, choose the option that doesn’t rely on a “grid” format (which includes vertical lines).
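To pull these tips together, here is a minimal sketch, using Python and pandas, of a small Results table laid out along the lines described above. The solubility numbers, column headings, and table title are invented purely for illustration; substitute your own data and wording.

```python
# A minimal sketch (hypothetical data) of a Results table that follows the
# tips above: numbered, titled, units in the column headings, and like
# elements arranged in columns so readers compare values vertically.
import pandas as pd

table1 = pd.DataFrame(
    {
        "Temperature (°C)": [50, 60, 70, 80],
        "Polar solute, time to dissolve (s)": [140, 110, 90, 75],
        "Non-polar solute, time to dissolve (s)": [302, 297, 299, 301],
    }
)

print("Table 1. Time to dissolve for polar and non-polar solutes at "
      "temperatures at which dissolution occurred")
# to_string(index=False) prints a plain layout with no vertical grid lines.
print(table1.to_string(index=False))
```

Note that the point of comparison (the timed measurements) runs down the columns, and the plain-text output avoids the vertical lines discouraged above.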
How do I include figures in my report?
Although tables can be useful ways of showing trends in the results you obtained, figures (i.e., illustrations) can do an even better job of emphasizing such trends. Lab report writers often use graphic representations of the data they collected to provide their readers with a literal picture of how the experiment went.
When should you use a figure?
Remember the circumstances under which you don’t need a table: when you don’t have a great deal of data or when the data you have don’t vary a lot. Under the same conditions, you would probably forgo the figure as well, since the figure would be unlikely to provide your readers with an additional perspective. Scientists really don’t like their time wasted, so they tend not to respond favorably to redundancy.
If you’re trying to decide between using a table and creating a figure to present your material, consider the following a rule of thumb. The strength of a table lies in its ability to supply large amounts of exact data, whereas the strength of a figure is its dramatic illustration of important trends within the experiment. If you feel that your readers won’t get the full impact of the results you obtained just by looking at the numbers, then a figure might be appropriate.
Of course, an undergraduate class may expect you to create a figure for your lab experiment, if only to make sure that you can do so effectively. If this is the case, then don’t worry about whether to use figures or not—concentrate instead on how best to accomplish your task.
Figures can include maps, photographs, pen-and-ink drawings, flow charts, bar graphs, and sectional graphs (“pie charts”). But the most common figure by far, especially for undergraduates, is the line graph, so we’ll focus on that type in this handout.
At the undergraduate level, you can often draw and label your graphs by hand, provided that the result is clear, legible, and drawn to scale. Computer technology has, however, made creating line graphs a lot easier. Most word-processing software has a number of functions for transferring data into graph form; many scientists have found Microsoft Excel, for example, a helpful tool in graphing results. If you plan on pursuing a career in the sciences, it may be well worth your while to learn to use a similar program.
Computers can’t, however, decide for you how your graph really works; you have to know how to design your graph to meet your readers’ expectations. Here are some of these expectations (a short plotting sketch follows this list):
- Keep it as simple as possible. You may be tempted to signal the complexity of the information you gathered by trying to design a graph that accounts for that complexity. But remember the purpose of your graph: to dramatize your results in a manner that’s easy to see and grasp. Try not to make the reader stare at the graph for a half hour to find the important line among the mass of other lines. For maximum effectiveness, limit yourself to three to five lines per graph; if you have more data to demonstrate, use a set of graphs to account for it, rather than trying to cram it all into a single figure.
- Plot the independent variable on the horizontal (x) axis and the dependent variable on the vertical (y) axis. Remember that the independent variable is the condition that you manipulated during the experiment and the dependent variable is the condition that you measured to see if it changed along with the independent variable. Placing the variables along their respective axes is mostly just a convention, but since your readers are accustomed to viewing graphs in this way, you’re better off not challenging the convention in your report.
- Label each axis carefully, and be especially careful to include units of measure. You need to make sure that your readers understand perfectly well what your graph indicates.
- Number and title your graphs. As with tables, the title of the graph should be informative but concise, and you should refer to your graph by number in the text (e.g., “Figure 1 shows the increase in the solubility rate as a function of temperature”).
- Many editors of professional scientific journals prefer that writers distinguish the lines in their graphs by attaching a symbol to them, usually a geometric shape (triangle, square, etc.), and using that symbol throughout the curve of the line. Generally, readers have a hard time distinguishing dotted lines from dot-dash lines from straight lines, so you should consider staying away from line styles as the way of telling your curves apart. Editors don’t usually like different-colored lines within a graph because colors are difficult and expensive to reproduce; colors may, however, be great for your purposes, as long as you’re not planning to submit your paper to Nature. Use your discretion: try to employ whichever technique dramatizes the results most effectively.
- Try to gather data at regular intervals, so the plot points on your graph aren’t too far apart. You can’t be sure of the arc you should draw between the plot points if the points are located at the far corners of the graph; over a fifteen-minute interval, perhaps the change occurred in the first or last thirty seconds of that period (in which case your straight-line connection between the points is misleading).
- If you’re worried that you didn’t collect data at sufficiently regular intervals during your experiment, go ahead and connect the points with a straight line, but you may want to examine this problem as part of your Discussion section.
- Make your graph large enough so that everything is legible and clearly demarcated, but not so large that it either overwhelms the rest of the Results section or provides a far greater range than you need to illustrate your point. If, for example, the seedlings of your plant grew only 15 mm during the trial, you don’t need to construct a graph that accounts for 100 mm of growth. The lines in your graph should more or less fill the space created by the axes; if you see that your data is confined to the lower left portion of the graph, you should probably re-adjust your scale.
- If you create a set of graphs, make them the same size and format, including all the verbal and visual codes (captions, symbols, scale, etc.). You want to be as consistent as possible in your illustrations, so that your readers can easily make the comparisons you’re trying to get them to see.
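To make these expectations concrete, here is a minimal sketch of how a line graph along the lines of the solubility example might be produced with Python and matplotlib. All of the data values, labels, and the output file name are hypothetical, invented purely for illustration; adapt them to your own experiment and to whatever conventions your instructor prefers.

```python
# A minimal sketch (hypothetical data) of a lab-report line graph:
# independent variable on the x-axis, dependent variable on the y-axis,
# labeled axes with units, a numbered and titled figure, and point markers
# to distinguish the two curves.
import matplotlib.pyplot as plt

# Hypothetical measurements: solvent temperature (°C) vs. time to dissolve (s)
temperature_c = [20, 30, 40, 50, 60, 70, 80]            # independent variable
polar_solute_s = [310, 240, 185, 140, 110, 90, 75]      # dependent variable
nonpolar_solute_s = [300, 295, 298, 302, 297, 299, 301]

fig, ax = plt.subplots(figsize=(5, 4))
ax.plot(temperature_c, polar_solute_s, marker="o", label="Polar solute")
ax.plot(temperature_c, nonpolar_solute_s, marker="s", label="Non-polar solute")

# Label each axis, and include the units of measure.
ax.set_xlabel("Solvent temperature (°C)")
ax.set_ylabel("Time to dissolve (s)")

# Number and title the figure so the text can refer to it by number.
ax.set_title("Figure 1. Time to dissolve as a function of solvent temperature")
ax.legend()

fig.tight_layout()
fig.savefig("figure1_solubility.png", dpi=300)
```

Using distinct point markers rather than dotted and dot-dash line styles keeps the two curves easy to tell apart, and limiting the figure to a few lines keeps it simple to read.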
How do I write a strong Discussion section?
The discussion section is probably the least formalized part of the report, in that you can’t really apply the same structure to every type of experiment. In simple terms, here you tell your readers what to make of the Results you obtained. If you have done the Results part well, your readers should already recognize the trends in the data and have a fairly clear idea of whether your hypothesis was supported. Because the Results can seem so self-explanatory, many students find it difficult to know what material to add in this last section.
Basically, the Discussion contains several parts, in no particular order, but roughly moving from specific (i.e., related to your experiment only) to general (how your findings fit in the larger scientific community). In this section, you will, as a rule, need to:
- Explain whether the data support your hypothesis
- Acknowledge any anomalous data or deviations from what you expected
- Derive conclusions, based on your findings, about the process you’re studying
- Relate your findings to earlier work in the same area (if you can)
- Explore the theoretical and/or practical implications of your findings
Let’s look at some dos and don’ts for each of these objectives.
Explain whether the data support your hypothesis
This statement is usually a good way to begin the Discussion, since you can’t effectively speak about the larger scientific value of your study until you’ve figured out the particulars of this experiment. You might begin this part of the Discussion by explicitly stating the relationships or correlations your data indicate between the independent and dependent variables. Then you can show more clearly why you believe your hypothesis was or was not supported. For example, if you tested solubility at various temperatures, you could start this section by noting that the rates of solubility increased as the temperature increased. If your initial hypothesis surmised that temperature change would not affect solubility, you would then say something like,
“The hypothesis that temperature change would not affect solubility was not supported by the data.”
Note: Students tend to view labs as practical tests of undeniable scientific truths. As a result, you may want to say that the hypothesis was “proved” or “disproved” or that it was “correct” or “incorrect.” These terms, however, reflect a degree of certainty that you as a scientist aren’t supposed to have. Remember, you’re testing a theory with a procedure that lasts only a few hours and relies on only a few trials, which severely compromises your ability to be sure about the “truth” you see. Words like “supported,” “indicated,” and “suggested” are more acceptable ways to evaluate your hypothesis.
Also, recognize that saying whether the data supported your hypothesis or not involves making a claim to be defended. As such, you need to show the readers that this claim is warranted by the evidence. Make sure that you’re very explicit about the relationship between the evidence and the conclusions you draw from it. This process is difficult for many writers because we don’t often justify conclusions in our regular lives. For example, you might nudge your friend at a party and whisper, “That guy’s drunk,” and once your friend lays eyes on the person in question, she might readily agree. In a scientific paper, by contrast, you would need to defend your claim more thoroughly by pointing to data such as slurred words, unsteady gait, and the lampshade-as-hat. In addition to pointing out these details, you would also need to show how (according to previous studies) these signs are consistent with inebriation, especially if they occur in conjunction with one another. To put it another way, tell your readers exactly how you got from point A (was the hypothesis supported?) to point B (yes/no).
Acknowledge any anomalous data, or deviations from what you expected
You need to take these exceptions and divergences into account, so that you qualify your conclusions sufficiently. For obvious reasons, your readers will doubt your authority if you (deliberately or inadvertently) overlook a key piece of data that doesn’t square with your perspective on what occurred. In a more philosophical sense, once you’ve ignored evidence that contradicts your claims, you’ve departed from the scientific method. The urge to “tidy up” the experiment is often strong, but if you give in to it you’re no longer performing good science.
Sometimes after you’ve performed a study or experiment, you realize that some part of the methods you used to test your hypothesis was flawed. In that case, it’s OK to suggest that if you had the chance to conduct your test again, you might change the design in this or that specific way in order to avoid such and such a problem. The key to making this approach work, though, is to be very precise about the weakness in your experiment, why and how you think that weakness might have affected your data, and how you would alter your protocol to eliminate—or limit the effects of—that weakness. Often, inexperienced researchers and writers feel the need to account for “wrong” data (remember, there’s no such animal), and so they speculate wildly about what might have screwed things up. These speculations include such factors as the unusually hot temperature in the room, or the possibility that their lab partners read the meters wrong, or the potentially defective equipment. These explanations are what scientists call “cop-outs,” or “lame”; don’t indicate that the experiment had a weakness unless you’re fairly certain that a) it really occurred and b) you can explain reasonably well how that weakness affected your results.
Derive conclusions, based on your findings, about the process you’re studying
If, for example, your hypothesis dealt with the changes in solubility at different temperatures, then try to figure out what you can rationally say about the process of solubility more generally. If you’re doing an undergraduate lab, chances are that the lab will connect in some way to the material you’ve been covering either in lecture or in your reading, so you might choose to return to these resources as a way to help you think clearly about the process as a whole.
This part of the Discussion section is another place where you need to make sure that you’re not overreaching. Again, nothing you’ve found in one study would remotely allow you to claim that you now “know” something, or that something isn’t “true,” or that your experiment “confirmed” some principle or other. Hesitate before you go out on a limb; it’s dangerous. Instead, use more tentative language, such as “suggest,” “indicate,” “correspond,” “possibly,” and “challenge.”
Relate your findings to previous work in the field (if possible)
We’ve been talking about how to show that you belong in a particular community (such as biologists or anthropologists) by writing within conventions that they recognize and accept. Another way to establish that membership is to identify a conversation going on among members of that community, and use your work to contribute to that conversation. In a larger philosophical sense, scientists can’t fully understand the value of their research unless they have some sense of the context that provoked and nourished it. That is, you have to recognize what’s new about your project (potentially, anyway) and how it benefits the wider body of scientific knowledge. On a more pragmatic level, especially for undergraduates, connecting your lab work to previous research will demonstrate to the TA that you see the big picture. You have an opportunity, in the Discussion section, to distinguish yourself from the students in your class who aren’t thinking beyond the barest facts of the study. Capitalize on this opportunity by putting your own work in context.
If you’re just beginning to work in the natural sciences (as a first-year biology or chemistry student, say), most likely the work you’ll be doing has already been performed and re-performed to a satisfactory degree. Hence, you could probably point to a similar experiment or study and compare/contrast your results and conclusions. More advanced work may deal with an issue that is somewhat less “resolved,” and so previous research may take the form of an ongoing debate, and you can use your own work to weigh in on that debate. If, for example, researchers are hotly disputing the value of herbal remedies for the common cold, and the results of your study suggest that Echinacea diminishes the symptoms but not the actual presence of the cold, then you might want to take some time in the Discussion section to recapitulate the specifics of the dispute as it relates to Echinacea as an herbal remedy. (Consider that you have probably already written in the Introduction about this debate as background research.)
Explore the theoretical and/or practical implications of your findings
This information is often the best way to end your Discussion (and, for all intents and purposes, the report). In argumentative writing generally, you want to use your closing words to convey the main point of your writing. This main point can be primarily theoretical (“Now that you understand this information, you’re in a better position to understand this larger issue”) or primarily practical (“You can use this information to take such and such an action”). In either case, the concluding statements help the reader to comprehend the significance of your project and your decision to write about it.
Since a lab report is argumentative—after all, you’re investigating a claim, and judging the legitimacy of that claim by generating and collecting evidence—it’s often a good idea to end your report with the same technique for establishing your main point. If you want to go the theoretical route, you might talk about the consequences your study has for the field or phenomenon you’re investigating. To return to the examples regarding solubility, you could end by reflecting on what your work on solubility as a function of temperature tells us (potentially) about solubility in general. (Some folks consider this type of exploration “pure” as opposed to “applied” science, although these labels can be problematic.) If you want to go the practical route, you could end by speculating about the medical, institutional, or commercial implications of your findings—in other words, answer the question, “What can this study help people to do?” In either case, you’re going to make your readers’ experience more satisfying, by helping them see why they spent their time learning what you had to teach them.
Works consulted
We consulted these works while writing this handout. This is not a comprehensive list of resources on the handout’s topic, and we encourage you to do your own research to find additional publications. Please do not use this list as a model for the format of your own reference list, as it may not match the citation style you are using. For guidance on formatting citations, please see the UNC Libraries citation tutorial. We revise these tips periodically and welcome feedback.
American Psychological Association. 2010. Publication Manual of the American Psychological Association. 6th ed. Washington, DC: American Psychological Association.
Beall, Herbert, and John Trimbur. 2001. A Short Guide to Writing About Chemistry. 2nd ed. New York: Longman.
Blum, Deborah, and Mary Knudson. 1997. A Field Guide for Science Writers: The Official Guide of the National Association of Science Writers. New York: Oxford University Press.
Booth, Wayne C., Gregory G. Colomb, Joseph M. Williams, Joseph Bizup, and William T. FitzGerald. 2016. The Craft of Research. 4th ed. Chicago: University of Chicago Press.
Briscoe, Mary Helen. 1996. Preparing Scientific Illustrations: A Guide to Better Posters, Presentations, and Publications. 2nd ed. New York: Springer-Verlag.
Council of Science Editors. 2014. Scientific Style and Format: The CSE Manual for Authors, Editors, and Publishers. 8th ed. Chicago & London: University of Chicago Press.
Davis, Martha. 2012. Scientific Papers and Presentations. 3rd ed. London: Academic Press.
Day, Robert A. 1994. How to Write and Publish a Scientific Paper. 4th ed. Phoenix: Oryx Press.
Porush, David. 1995. A Short Guide to Writing About Science. New York: Longman.
Williams, Joseph, and Joseph Bizup. 2017. Style: Lessons in Clarity and Grace. 12th ed. Boston: Pearson.
You may reproduce it for non-commercial use if you use the entire handout and attribute the source: The Writing Center, University of North Carolina at Chapel Hill
15 Abstract Examples: A Comprehensive Guide

Demystifying Abstract Writing
An abstract represents a concise, well-articulated summary of an academic piece or research. But writing an abstract goes beyond merely creating a summary. In this piece, we’ll delve into examples of abstracts to illuminate what they truly are, along with the necessary tone, style, and word counts.
You’ll also see how diverse abstract writing can be, tailored according to the subject area. For instance, an abstract for empirical research in the sciences differs greatly from one for a humanities article.
The Importance of Abstracts: Why Do We Write Them?
Every abstract you encounter, including our abstract writing example, has a few core characteristics. The primary role of an abstract is to encapsulate the essential points of a research article, much like a book’s back cover. The back jacket often influences whether you buy the book or not.
Similarly, academic papers are often behind paywalls, and the abstract assists readers in deciding if they should purchase the article. If you’re a student or researcher, the abstract helps you gauge whether the article is worth your time.
Furthermore, abstracts promote ongoing research in your field by incorporating keywords that allow others to locate your work. Knowing how to write a good abstract contributes to your professionalism, especially crucial for graduate-level studies. This skill might be vital when submitting your research to peer-reviewed journals or soliciting grant funding.
Breaking Down an Abstract: What’s Inside?
The contents of an abstract heavily rely on the type of study, research design, and subject area. An abstract may contain a succinct background statement highlighting the research’s significance, a problem statement, the methodologies used, a synopsis of the results, and the conclusions drawn.
When it comes to writing an abstract for a research paper, striking a balance between conciseness and informative detail is essential. Our examples of abstracts will help you grasp this balance better.
Moreover, you’ll learn how to format abstracts variably, matching the requirements of your degree program or publication guidelines.
Key Elements to Include in Your Abstract
- Brief Background: Introduce the importance of the research from your point of view.
- Problem Statement: Define the issue your research addresses, commonly referred to as the thesis statement.
- Methodology: Describe the research methods you employed.
- Synopsis: This should include a summary of your results and conclusions.
- Keywords: Implement terms that others will use to find your article.
Types of Abstracts
- Descriptive Abstracts: These give an overview of the source material without delving into results and conclusions.
- Informative Abstracts: These offer a more detailed look into your research, including the purpose, methods, results, and conclusions.
A few quick tips:
- Always write your abstract in the present tense.
- Keep track of word counts to maintain brevity.
- The original text should guide your abstract.
- Always provide a good synopsis in your abstract.
- If needed, use your abstract to draft a compelling query letter.
- Consider providing a literature review abstract if your research involves an extensive review of existing literature.
Types of Abstract
According to the Purdue Online Writing Lab resource, there are two different types of abstract: informational and descriptive.
Although informative and descriptive abstracts seem similar, they are different in a few key ways.
An informative abstract contains all the information related to the research, including the results and the conclusion.
A descriptive abstract is typically much shorter, and does not provide as much information. Rather, the descriptive abstract just tells the reader what the research or the article is about and not much more.
The descriptive abstract is more of a tagline or a teaser, whereas the informative abstract is more like a summary.
You will find both types of abstracts in the examples below.
Abstract Examples
Informative Abstract Example 1
Emotional intelligence (EQ) has been correlated with leadership effectiveness in organizations. Using a mixed-methods approach, this study assesses the effect of emotional intelligence on academic performance at the high school level.
The Emotional Intelligence rating scale was used, as well as semi-structured interviews with teachers. Participant grades were collected. Emotional intelligence was found to correlate positively with academic success. Implications for pedagogical practice are discussed.
Explanation
This is a typical informative abstract for empirical social sciences research. Most informative abstracts proceed in a logical fashion to reflect the organization of the main paper: with sections on the background, methods, results, and conclusions.
Informative Abstract Example 2
Social learning takes place through observations of others within a community. In diverse urban landscapes and through digital media, social learning may be qualitatively different from the social learning that takes place within families and tightly-knit social circles.
This study examines the differences between social learning that takes place in the home versus social learning that takes place from watching celebrities and other role models online. Results show that social learning takes place with equal efficacy. These results show that social learning does not just take place within known social circles, and that observations of others can lead to multiple types of learning.
This is a typical informative abstract for empirical social science research. After the background statement, the author discusses the problem statement or research question, followed by the results and the conclusions.
Informative Abstract Example 3
Few studies have examined the connection between visual imagery and emotional reactions to news media consumption. This study addresses the gap in the literature via the use of content analysis. Content analysis methods were used to analyze five news media television sites over the course of six months.
Using the Yolanda Metrics method, the researchers ascertained ten main words that were used throughout each of the news media sites. Implications and suggestions for future research are included.
This abstract provides an informative synopsis of a quantitative study on content analysis. The author provides the background information, addresses the methods, and also outlines the conclusions of the research.
Informative Abstract Example 4
This study explores the relationship between nurse educator theoretical viewpoints and nursing outcomes. Using a qualitative descriptive study, the researchers conducted face-to-face interviews with nursing students and nurse educators. The results show that nurse educator theoretical viewpoints had a direct bearing on nurse self-concept. Nurse educators should be cognizant of their biases and theoretical viewpoints when instructing students.
This example showcases how to write an abstract for a qualitative study. Qualitative studies also have clearly defined research methods. Therefore, it is important to keep in mind the general principles of informative abstract writing. Always begin with the research question or problem statement, and proceed to offer a one-sentence description of study methods and results.
Informative Abstract Example 5
Aboriginal people have poorer health outcomes versus their counterparts from other ethnic groups. In this study, public health researchers conducted an epidemiological data analysis using results from the Transcultural Health Report. Using a chi-square test, the researchers found that there is a direct correlation between ethnicity and health status. Policymakers should consider introducing methods for reducing health disparities among minority groups.
This informative abstract details the methods used in the report. As with other informative abstracts, it is written in the past tense. The abstract provides the reader with a summary of the research that has already been conducted.
Informative Abstract Example 6
We examine the contradictions of decolonization as official state policy. Using themes related to decolonization from the literature, we discuss how oppressed people develop cogent policies that create new systems of power. Intersectionality is also discussed.
Through a historical analysis, it was found that decolonization and political identity construction take place not as reactionary pathways but as deliberate means of regaining access to power and privilege. The cultivation of new political and social identities promotes social cohesion in formerly colonized nation-states, paving the way for future means of identity construction.
This abstract is informative, but because it does not involve a unique empirical research design, it is written in a different manner from other informative abstracts. The researchers use tone, style, and diction that parallel those of the body of the text. The main themes are elucidated.
Informative Abstract Example 7
Background
The implementation of a nationwide mandatory vaccination program against influenza in the country of Maconda was designed to lower rates of preventable illnesses. This study was designed to measure the cost-effectiveness of the mandatory vaccination program.
Methods
This is a cohort study designed to assess the rates of new influenza cases among both children (age > 8 years) and adults (age > 18 years). Using the National Reference Data Report of Maconda, the researchers compiled new case data (n = 2034) from 2014 to 2018.
Findings
A total of 45 new cases were reported during the years of 2014 and 2015, and after that, the number of new cases dropped by 74%. The significant decrease in new influenza cases can be attributed to the introduction of mandatory vaccination.
Interpretation
The mandatory vaccination program proves cost-effective given its efficacy in controlling the disease.
This method of writing an informative abstract divides the content into respective subject headers. This style makes the abstract easier for some readers to scan quickly.
Informative Abstract Example 8
Mindfulness-based meditation and mindfulness-based stress reduction techniques have been shown to reduce burnout and improve employee engagement. Using a pretest/posttest design, the researchers randomly assigned nurses (n = 136) to the control and experimental groups. The Kabat-Zinn mindfulness-based stress reduction technique was used as the primary intervention for the experimental group.
Quantitative findings revealed significant improvements on self-report scales for depression and anxiety. Nurse leaders and administrators should consider implementing a mindfulness-based stress reduction program to reduce burnout and improve overall nurse performance.
This abstract contains all the necessary information you would need to make an assessment of whether the research was pertinent to your study. When you are writing an informative abstract, consider taking one sentence from each of the sections in your research (introduction/background, methods, results, and conclusion).
Descriptive Abstract Example 1
What inspires individuals to become members of a new religious movement, or a “cult”? This review of the literature offers some suggestions as to the psychological and sociological motivations for joining a new religious movement, offering suggestions for future research.
Unlike informative abstracts, descriptive abstracts simply alert the reader of the main gist of the article. Reading this abstract does not tell you exactly what the researchers found out about their subject, but it does let the reader know what the overall subject matter was and the methods used to conduct the research.
Descriptive Abstract Example 2
With few remaining survivors of the Holocaust, it becomes critical for historians to gather as much data as possible that can contribute to an overall understanding of the ways trauma has been incorporated into identity. Interviews with five Holocaust survivors reveal new information about the role that art and music played in self-healing and community healing.
This descriptive abstract does not give too much information away, simply telling the reader that the researcher used interviews and a case study research design. Although it is a brief description of the study, the researchers succinctly summarize the contents and results.
Descriptive Abstract Example 3
Absurdist theater and literature have had a strong influence on playwrights in France and England. This analysis of absurdist theater addresses the primary symbols being used in absurdist literature and traces the evolution of those symbols as they parallel historical events.
As with most descriptive abstracts, this example is short. You can use descriptive abstracts to provide the reader with a summary of non-empirical research such as literary criticism.
Descriptive Abstract Example 4
The architecture of Oscar Niemeyer reflects socialist sensibilities in the urban planning of Brasilia. This research explores the philosophical underpinnings of Niemeyer’s design through an analysis of several of the main elements of the National Congress of Brazil. Implications and influences of Niemeyer’s work are also discussed.
Note how with the descriptive abstract, you are writing about the research in a more abstract and detached way than when you write an informative abstract.
Descriptive Abstract Example 5
Jacques Derrida has written extensively on the symbolism and the metonymy of September 11. In this research, we critique Derrida’s position, on the grounds that terrorism is better understood from within a neo-realist framework. Derrida’s analysis lacks coherence, is pompous and verbose, and is unnecessarily abstract when considering the need for a cogent counterterrorism strategy.
Like most descriptive abstracts, this encapsulates the main idea of the research but does not necessarily follow the same format as you might use in an informative abstract. Whereas an informative abstract follows the chronological format used in the paper you present, with introduction, methods, findings, and conclusion, a descriptive abstract only focuses on the main idea.
Descriptive Abstract Example 6
The Five Factor model of personality has been well established in the literature and is one of the most reliable and valid methods of assessing success. In this study, we use the Five Factor model to show when the qualities of neuroticism and introversion, which have been typically linked with low rates of success, are actually correlated with achievement in certain job sectors. Implications and suggestions for clinicians are discussed.
This descriptive abstract does not discuss the methodology used in the research, which is what differentiates it from an informative abstract. However, the description does include the basic elements contained in the report.
Descriptive Abstract Example 7
This is a case study of a medium-sized company, analyzing the competencies required for entering the Indian retail market. Focusing on Mumbai and Bangalore, the expansion into these markets reveals potential challenges for European firms. A comparison case of a failed expansion into Wuhan, China is included, suggesting that there are no global cross-cultural competencies that can be applied in all cases.
While this descriptive abstract shows the reader what the paper addresses, the methods and results are omitted. A descriptive abstract is shorter than an informative abstract.
Which Type of Abstract Should I Use?
Check with your professors or academic advisors, or with the editor of the peer-reviewed journal before determining which type of abstract is right for you.
If you have conducted original empirical research in the social sciences, you will most likely want to use an informative abstract.
However, when you are writing about the arts or humanities, a descriptive abstract might work best.
What Information Should I Include in An Abstract?
The information you include in the abstract will depend on the substantive content of your report.
Consider breaking down your abstract into five separate components, corresponding roughly with the structure of your original research.
You can write one or two sentences on each of these sections:
For Original Empirical Research
I. Background/Introductory Sentence
If you have conducted, or are going to conduct, original research, then consider the following elements for your abstract:
What was your hypothesis?
What has the previous literature said about your subject?
What was the gap in the literature you are filling with your research?
What are the research questions?
What problem are you trying to solve?
What theoretical viewpoint or approach did you take?
II. Methods
What was your research design (qualitative, quantitative, multi-factorial, mixed-methods)?
What was the setting? Did you conduct a clinical analysis? Or did you conduct a systematic review of literature or a meta-analysis of data?
How many subjects were there?
How did you collect data?
How did you analyze the data?
What methodological weaknesses need to be mentioned?
III. Results
If this was a qualitative study, what were the major findings?
If this was a quantitative study, what were the major findings? Was there an alpha coefficient? What was the standard deviation?
Were the results statistically significant?
IV. Discussion
Did the results prove or disprove the hypothesis?
Were the results significant enough to inform future research?
How do your results link up with previous research? Does your research confirm or go beyond prior literature?
V. Conclusions/Recommendations
What do your results say about the research question or problem statement?
If you had to make a policy recommendation or offer suggestions to other scholars, what would you say?
Are there any concluding thoughts or overarching impressions?
Writing Abstracts for Literary Criticism and Humanities Research
Writing abstracts for research that is not empirical in nature does not involve the same steps as you might use when composing an abstract for the sciences or social sciences.
When writing an abstract for the arts and humanities, consider the following outline, writing one or two sentences for each section:
I. Background/Introduction
What other scholars have said before.
Why you agree or disagree.
Why this is important to study.
II. Your methods or approach
How did you conduct your research?
Did you analyze a specific text, case study, or work of art?
Are you comparing and contrasting?
What philosophical or theoretical model did you use?
III. Findings
What did you discover in the course of your research?
IV. Discussion/Conclusion
How are your findings meaningful?
What new discoveries have you made?
How does your work contribute to the discourse?
General Tips for Writing Abstracts
The best way to improve your abstract writing skills is to read more abstracts. When you read other abstracts, you will understand more about what is expected, and what you should include or leave out from the abstract.
Reading abstracts helps you become more familiar with the tone and style, as well as the structure of abstracts.
Write your abstract after you have completed your research.
Many successful abstracts actually take the first sentence from each section of your research, such as the introduction/background, review of literature, methods, results, discussion, and conclusion.
Although it is a good idea to write the results of your original research, avoid giving too much detail. Instead, focus on what really matters.
A good abstract is like an elevator pitch.
While there is no absolute rule for how long an abstract should be, a general rule of thumb is around 100-150 words. However, some descriptive abstracts may be shorter than that, and some informative abstracts could be longer.
How to Write a Synopsis
Writing a synopsis involves summarizing a work’s key elements, including the narrative arc, major plot points, character development, rising action, and plot twists. Here’s a step-by-step guide on how to create a compelling synopsis.
- Outline the Narrative Arc: Start by defining your story’s beginning, middle, and end. This includes the introduction, rising action, climax, falling action, and resolution.
- Identify Major Plot Points: Major plot points are crucial events that propel your story forward. Identify these critical moments and explain how they contribute to the narrative arc.
- Discuss Character Development: Characters are the backbone of your story. Describe your characters at the start of the story and demonstrate how they evolve by the end.
- Illustrate Rising Action: The rising action is a series of events that lead to the climax of your story. Ensure to discuss these events and how they build suspense and momentum.
- Include Plot Twists: If your story has unexpected turns or surprises, highlight these plot twists in your synopsis. However, ensure these twists aren’t revealed too abruptly.
Remember, a synopsis should provide a complete overview of your story. It’s different from a teaser or back cover blurb — your objective isn’t to create suspense, but to succinctly present the whole narrative.
How Long Should a Summary Be
The length of a summary varies based on the complexity and length of the original work. However, as a rule of thumb, a summary should ideally be no more than 10-15% of the original text’s word count. This ensures you cover the significant plot points, character development, narrative arc, rising action, and plot twists without going into excessive detail.
For instance, if you’re summarizing a 300-page novel, your summary may be about 30 pages. If you’re summarizing a short 5-page article, a half-page to one-page summary should suffice.
Remember, the goal of a summary is to condense the source material, maintaining the core ideas and crucial information while trimming unnecessary details. Always aim for brevity and clarity in your summaries.
Abstracts are even shorter versions of executive summaries. Although abstracts are brief and seem relatively easy, they can be challenging to write. If you are struggling to write your abstract, just consider the main ideas of your original research paper and pretend that you are summarizing that research for a friend.
If you would like more examples of strong abstracts in your field of research, or need help composing your abstract or conducting research, call a writing tutor.
How to Write a Scientific Abstract
A scientific abstract summarizes your research paper or article in a concise, clearly written way that informs readers about the article's content. Researchers use abstracts to determine whether a paper is relevant to their work and/or decide which papers to acquire and read. For academic conferences, participants only receive copies of the abstracts in proceedings. When readers search through electronic databases for articles, the abstract is usually the sole part of the paper that they see without cost. Typically 200-250 words, a scientific abstract consists of five key parts: title and author information, background, methods, results, and conclusions. [1]
Preparing to Write an Abstract

Structuring an Abstract

- Think of the research paper as having investigated a particular scientific question. Other researchers will value knowing your research question.
- Your background section should answer questions like: What did I study? Why is my research question important? What did my field of study know about my research question before I did this study? How will this study advance knowledge in our field? [6]
- Try to use an active voice and reduce passive language throughout your abstract. For example, write: "I interviewed Cassandra" instead of "Cassandra was interviewed by me."
- Minimize use of pronouns like "I" or "we." Write about "the study," "this paper examines," or "this research" instead of "my study" or "I write about..."
- Keep your abstract in the past or present tense but not in the future. For instance, do not write: "this paper will examine" but "this paper examines" or "the results showed."

- What was the research design?
- How long did the study last?
- What was the sample size?
- How did you recruit participants?
- What was the research setting? [7]

- Some organizations, journals, or conferences require a special format for the title, which could be all uppercase letters, bolded, or italics. [12]
Checking Style and Flow

- Read the abstract as if you were another researcher deciding whether to read your paper. Do you find the abstract has the right information to help you decide whether to read it? If not, ask yourself what is missing.

- Be sure to place commas and periods within quotation marks, e.g. "Milton said." instead of "Milton said".
- Do not end sentences with prepositions (of, for, about).
- Vary your verbs and nouns from sentence to sentence and use a print or online thesaurus for synonyms in order to not sound repetitive.
- Avoid vague adjectives like "very" and "many." Try to quantify your findings with specific numbers or conditions that offer comparisons. For example, "135 interlocutors participated" or "Subject A's performance was thirty percent better than Subject B's performance."
- Written years should not have apostrophes. Thus, write "1990s" rather than "1990's."
- Eliminate unnecessary content and add any missing important pieces of information. [13]
- Do not include references to other papers, non-standard abbreviations, or any kinds of illustrations. For abbreviations you do include, always spell them out the first time you use them unless they are in common use. [14]
- Review the abstract guidelines for the specific journal or scholarly discipline for which you are writing. Most disciplines (e.g. biology or sociology) have their own stylistic conventions for abstracts. Use other abstracts from your field as examples for style.
- If you write an abstract for a research paper that you did not write, remember that it is not your job to review the paper, criticize its methods, or offer your opinion on the importance or relevance of the research.
- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3136027/
- https://owl.english.purdue.edu/owl/resource/706/1/
- www.med.upenn.edu/focus/.../WritingaResearchAbstract_CompI-B.doc
- http://abacus.bates.edu/~ganderso/biology/resources/writing/HTWsections.html#abstract
- www.acponline.org/residents_fellows/competitions/abstract/prepare/res_abs.htm
About this article
When you write an academic abstract, start with 1 to 3 introductory sentences that explain the study’s topic, purpose, and research questions. Then, follow up with 2 to 3 sentences on how you conducted your study, including its duration and sample size. Next, write 1 to 2 sentences on the results of the study. Finally, conclude with 1 to 2 sentences on the main point and impact of the research.
How to write an abstract for a lab report

If you have ever written a lab report, you know that it can be a difficult task. But one of the most challenging parts of writing a lab report is composing the abstract. The abstract is a brief summary of your entire report, and it must be concise and to-the-point. In this blog post, we will walk you through the process of writing an abstract for a lab report. We will provide tips and guidance to help you create an effective abstract that accurately represents your work.
The purpose of an abstract in a laboratory paper is to give the reader a brief overview of the lab report. It should be a single paragraph that provides an overview of the purpose, method, results, and conclusion of the experiment. While the abstract should be brief, it should also be detailed enough to give readers a good sense of what was done in the lab and what the results were.
What is an abstract for a lab report?
The abstract for a lab report is a short summary of the entire report. It should include the purpose of the experiment, the methods used, the results obtained, and the conclusion drawn from the data. The abstract should be written in clear, concise language and should be no more than a few paragraphs long. When writing the abstract, keep in mind that it should be able to stand alone; readers should be able to understand the main points of the report without reading the rest of it. With this in mind, make sure to include all essential information and avoid using jargon or technical terms that might not be familiar to everyone. By following these tips, you can write an effective abstract that will help your lab report stand out.
How to Write an Abstract
A lab report is a great way to share your scientific findings with the world, and the abstract is a brief summary of that report. Because the abstract is often the only part of the report that people will read, it deserves particular care. So how do you write a good one?
Here are 5 steps on how to write a lab report abstract:
1. Introduce the topic:
Start with a brief introduction to your topic, typically a short overview of the experiment. The introduction should also give context for the results, explaining why they are important and how they relate to previous work in the field. Providing this information up front helps readers understand and appreciate the significance of the results, and it gives them a roadmap for the rest of the report.
2. State the research question:
State your research question or hypothesis.
In any scientific paper, the research question must be stated clearly and concisely. In a lab report, this is usually done in the Introduction section. The research question should be specific enough that it can be answered within the scope of the experiment, but broad enough that it is of interest to others in the field.
For example, a good research question for a paper on the effect of temperature on the rate of photosynthesis might be “What is the optimum temperature for photosynthesis in Elodea leaves?”
Once the research question has been identified, it should be used to guide the rest of the experiment. Every step taken during the experiment should be aimed at answering the research question.
3. Describe your methods and results.
The next step in writing a lab report is to briefly describe your methods and results. In this section, you should provide a brief overview of the experiments you conducted and the data you collected.
However, you should not go into too much detail, as this will be covered in the later sections of the report.
Instead, focus on providing a clear and concise description of your work. This will help to give your reader a general understanding of your findings and allow them to follow your report more easily.
4. Briefly discuss your findings.
The next step is to briefly discuss your findings. This section should be relatively short, as you will go into greater detail in the results section. Here, simply state what you found and how it relates to your hypothesis.
For example, if you were testing the effects of different fertilizers on plant growth, you would state whether or not the fertilizer had an effect on the plants.
If you found that one type of fertilizer caused the plants to grow more quickly, you would mention this in the discussion section. However, you would not go into great detail about the exact results of the experiment in this section. Instead, that information will be presented in the results section.
5. Conclude the abstract.
The conclusion of your abstract should be a brief statement of the implications of your work. This is where you tell the reader what your work means for the field as a whole.
For example, if you’ve discovered a new method for synthesizing a particular compound, you would want to briefly describe how your method could be used in other areas of research.
Similarly, if you’ve performed an experiment that has yielded new insights into a particular phenomenon, you would want to describe how your findings could be applied to other situations.
In short, the conclusion of your abstract should offer a brief glimpse into the broader significance of your work.
6 Tips for Writing a Good Lab Report Abstract
Here are 6 tips for writing a good abstract for a lab report paper:
- The abstract should be written last, after you have finished your report.
- Keep it concise: an abstract should be no more than a paragraph, and should ideally be around 200 words.
- Start by stating the purpose of the lab report in the form of a research question or hypothesis. Then, briefly describe the methods you used to answer this question or test this hypothesis.
- Follow this with a summary of your results, and finally, state your conclusion. Be sure to answer any questions posed in the lab report prompt.
- The language you use in the abstract should be clear and concise; avoid jargon and technical terms where possible.
- The tone of the abstract should be neutral – it should not be too positive or negative.
If you follow these tips, you should be well on your way to writing a great lab report abstract!
Examples of Abstracts in Lab Reports
As noted above, an abstract is a concise, single-paragraph summary of a lab report that covers the purpose, method, results, and conclusion of the experiment in just enough detail to give readers a good sense of what was done and what was found.
Here are two examples of abstracts from lab reports:
Lab Report Abstract Example 1:
In this experiment, we tested the effects of different concentrations of acid on the rates of reaction for three different metals. We found that the rate of reaction increased as the concentration of acid increased, and that the type of metal did not have a significant effect on the rate of reaction.
Lab Report Abstract Example 2:
In this experiment, we tested the hypothesis that plants grown in soil with higher levels of nitrogen would grow taller than those grown in soil with lower levels of nitrogen. We found that plants grown in soil with higher levels of nitrogen did indeed grow taller than those grown in soil with lower levels of nitrogen. This supported our hypothesis and suggests that nitrogen is an important factor in plant growth.
In conclusion, an abstract is a brief summary of a lab report. It should be clear, concise, and provide an overview of the purpose, method, results, and conclusion of the experiment. By following these tips, you can write a great lab report abstract!
Lab report abstract writing help
Do you need help writing a lab report abstract? Our lab report writers can assist you with any aspect of your project- from start to finish!
Click here to ask a question and get answers online, or to learn more about our services. We look forward to helping you with your lab report abstract writing!
Resources & Further Readings
- Abstract- Richmond
- The Lab Report – University of Toronto Writing Advice
- Lab Report Writing – LibGuides at Phoenix College
- The Writing Center | Writing an Abstract | Guides
- Writing Lab Reports – Hunter College
- Scientific Abstracts – UConn Physics

Open access | Published: 07 June 2023
Health system-scale language models are all-purpose prediction engines
- Lavender Yao Jiang ORCID: orcid.org/0000-0003-2464-3281 1 , 2 ,
- Xujin Chris Liu 1 , 3 ,
- Nima Pour Nejatian 4 ,
- Mustafa Nasir-Moin ORCID: orcid.org/0000-0002-0389-1852 1 ,
- Duo Wang 5 ,
- Anas Abidin ORCID: orcid.org/0000-0003-0032-0664 4 ,
- Kevin Eaton 6 ,
- Howard Antony Riina 1 ,
- Ilya Laufer 1 ,
- Paawan Punjabi 6 ,
- Madeline Miceli 6 ,
- Nora C. Kim 1 ,
- Cordelia Orillac 1 ,
- Zane Schnurman 1 ,
- Christopher Livia 1 ,
- Hannah Weiss 1 ,
- David Kurland ORCID: orcid.org/0000-0002-4074-7497 1 ,
- Sean Neifert 1 ,
- Yosef Dastagirzada ORCID: orcid.org/0009-0003-3237-6037 1 ,
- Douglas Kondziolka 1 ,
- Alexander T. M. Cheung ORCID: orcid.org/0000-0003-0946-3493 1 ,
- Grace Yang 1 , 2 ,
- Ming Cao 1 , 2 ,
- Mona Flores ORCID: orcid.org/0000-0002-7362-3044 4 ,
- Anthony B. Costa 4 ,
- Yindalon Aphinyanaphongs 5 , 7 ,
- Kyunghyun Cho 2 , 8 , 9 , 10 &
- Eric Karl Oermann ORCID: orcid.org/0000-0002-1876-5963 1 , 2 , 11
Nature volume 619, pages 357–362 (2023)
- Computational science
- Translational research
Physicians make critical time-constrained decisions every day. Clinical predictive models can help physicians and administrators make decisions by forecasting clinical and operational events. Existing structured data-based clinical predictive models have limited use in everyday practice owing to complexity in data processing, as well as model development and deployment 1 , 2 , 3 . Here we show that unstructured clinical notes from the electronic health record can enable the training of clinical language models, which can be used as all-purpose clinical predictive engines with low-resistance development and deployment. Our approach leverages recent advances in natural language processing 4 , 5 to train a large language model for medical language (NYUTron) and subsequently fine-tune it across a wide range of clinical and operational predictive tasks. We evaluated our approach within our health system for five such tasks: 30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay prediction, and insurance denial prediction. We show that NYUTron has an area under the curve (AUC) of 78.7–94.9%, with an improvement of 5.36–14.7% in the AUC compared with traditional models. We additionally demonstrate the benefits of pretraining with clinical text, the potential for increasing generalizability to different sites through fine-tuning and the full deployment of our system in a prospective, single-arm trial. These results show the potential for using clinical language models in medicine to read alongside physicians and provide guidance at the point of care.
Physicians make difficult decisions every day requiring the integration of a tremendous amount of information. The information needed to make these medical decisions is scattered across various records, for example, a patient’s medical history and laboratory and imaging reports. When physicians perform their work, however, all of this information is ultimately integrated into the notes written by physicians to document and summarize patient care.
Clinical predictive models are frequently derived from rules that have existed for decades 6 , 7 , 8 , 9 , as well as from machine learning methods 10 , 11 , 12 , with most relying on structured inputs pulled from the electronic health record (EHR) or direct clinician inputs. This reliance on structured inputs introduces complexity in data processing, as well as in model development and deployment, which in part is responsible for the overwhelming majority of medical predictive algorithms being trained, tested and published, yet never deployed to assess their impact on real-world clinical care. This is frequently referred to as the ‘last-mile problem’ (refs. 1 , 2 , 3 ).
One of the most exciting recent developments in modern artificial intelligence (AI) research is large language models (LLMs). These massive neural networks (with millions or even billions of parameters) have been shown to obtain impactful results on a wide range of problems that rely on the reading and interpretation of human language. Several styles of LLMs have been developed over the past few years, broadly ranging from encoder models (such as BERT 4 ) to decoder models (such as GPT3; ref. 5 ). We theorized that LLMs could potentially solve the last-mile problem in medical predictive analytics by simply reading the notes written by physicians, thereby immediately accessing a comprehensive description of a patient’s medical state to provide decision support at the point of care across a wide range of clinical and operational tasks.
Here we present our results from developing, evaluating, deploying and prospectively assessing NYUTron, an LLM-based system that can integrate in real time with clinical workflows centred around writing notes and placing electronic orders. Our approach relies on the fact that all clinically useful data and medical professionals’ decision-making processes can be found as structured or unstructured text in the EHR (for example, as notes, laboratory results and reports on studies). Our approach leverages recent advances in natural language processing that suggest that sufficiently scaled, self-supervised LLMs can outperform strongly supervised approaches on non-medical predictive tasks 4 , 5 , 13 . We investigate our hypothesis in the NYU Langone Health System (‘NYU Langone’), a large multi-borough hospital system with a diverse patient population in New York, with 4 urban hospitals and 350 outpatient sites. We assess NYUTron on a battery of five tasks, including three clinical and two operational tasks (30-day all-cause readmission prediction, in-hospital mortality prediction, comorbidity index prediction, length of stay (LOS) prediction and insurance denial prediction) and provide a detailed analysis of our 30-day readmission task to look at questions of data efficiency, generalizability, deployability and potential clinical impact. By rethinking all of medical predictive analytics (see Supplementary Information section 1.1 for previous works) as a natural language processing problem, we show that it is possible to use LLMs as universal prediction engines for a wide range of medical predictive tasks.
Language model-based clinical prediction
Our language model-based approach has four steps: data collection, pretraining, fine-tuning and deployment. In the first step (Fig. 1a ), we collected a vast set of unlabelled clinical notes and five task-specific labelled clinical notes from the NYU Langone EHR. Unlike other studies, our datasets come from the entire hospital system with a diverse patient population from different clinical departments. Our large unlabelled dataset, ‘NYU Notes’, comprises 7.25 million clinical notes (for example, radiographic reads, history and physicals) from 387,144 patients across four hospitals, resulting in a 4.1 billion-word corpus curated from January 2011 to May 2020. Each one of our labelled fine-tuning sets contains 1–10 years of inpatient clinical notes (55,791–413,845 patients, 51–87 million words) with task-specific labels (2–4 classes). See Extended Data Table 1 for dataset statistics.
Fig. 1 | a, We queried the NYU Langone EHR for two types of datasets. The pretraining dataset, NYU Notes, contains 10 years of inpatient clinical notes (387,144 patients, 4.1 billion words). There are five fine-tuning datasets. Each contains 1–10 years of inpatient clinical notes (55,791–413,845 patients, 51–87 million words) with task-specific labels (2–4 classes). b, We pretrained a 109 million-parameter BERT-like LLM, termed NYUTron, on the entire EHR using an MLM task to create a pretrained model for medical language contained within the EHR. c, We subsequently fine-tuned the pretrained model on specific tasks (for example, 30-day all-cause readmission prediction) and validated it on held-out retrospective data. d, Lastly, the fine-tuned model was compressed into an accelerated format and loaded into an inference engine, which interfaces with the NYU Langone EHR to read discharge notes when they are signed by treating physicians.
In the second and third steps (Fig. 1b,c), we pretrained and fine-tuned an LLM for each downstream task using a bidirectional encoder model known as BERT (Bidirectional Encoder Representations from Transformers) and a masked language modelling (MLM) objective on the NYU Notes dataset 11 until the validation loss plateaued. The MLM objective randomly masks words or subwords in clinical notes and trains the language model to fill in the masked word correctly. Next, using the fine-tuning dataset, we fine-tuned the pretrained model (termed ‘NYUTron’) to predict the task label using the relationships learned in pretraining with clinical notes.
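To make the fine-tuning step concrete, here is a minimal sketch using the Hugging Face transformers and datasets libraries. It assumes a pretrained clinical encoder checkpoint and CSV files with a "note_text" column and a binary "label" column; all paths, column names and hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of fine-tuning a pretrained clinical encoder for a binary task
# such as 30-day readmission prediction. Paths, column names and hyperparameters
# are illustrative assumptions, not the authors' code.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "path/to/pretrained-clinical-bert"   # hypothetical pretrained encoder
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# Each CSV is assumed to have a "note_text" column and a binary "label" column.
ds = load_dataset("csv", data_files={"train": "readmission_train.csv",
                                     "validation": "readmission_val.csv"})

def tokenize(batch):
    # Truncate long discharge notes to the encoder's maximum context length.
    return tokenizer(batch["note_text"], truncation=True, max_length=512)

ds = ds.map(tokenize, batched=True)

args = TrainingArguments(output_dir="readmission-finetune",
                         per_device_train_batch_size=16,
                         num_train_epochs=3)
trainer = Trainer(model=model, args=args,
                  train_dataset=ds["train"], eval_dataset=ds["validation"],
                  tokenizer=tokenizer)
trainer.train()
print(trainer.evaluate())
```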
In the fourth step (Fig. 1d ), we deployed our best model to a high-performance inference engine, NYUTriton, that interfaces with the NYU Langone EHR. Deployment enabled real-time LLM-guided inference at the point of care. In a single-arm, non-interventional, prospective trial, we validated NYUTron’s performance on 30-day readmission prediction in a real-world environment and assessed its potential clinical impacts.
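The deployment step is specific to the authors' infrastructure, but the general pattern of compressing a fine-tuned model into an accelerated format can be sketched with a standard ONNX export. The checkpoint path and file names below are hypothetical, and this is a generic stand-in rather than the NYUTriton pipeline itself.

```python
# Illustrative only: export a fine-tuned classifier to ONNX so a generic inference
# engine can serve it. This is a stand-in for an accelerated deployment format,
# not a reproduction of the paper's NYUTriton pipeline; paths are hypothetical.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

path = "readmission-finetune"                       # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForSequenceClassification.from_pretrained(path).eval()
model.config.return_dict = False                    # export plain tensor outputs

example = tokenizer("Example discharge note.", return_tensors="pt",
                    padding="max_length", truncation=True, max_length=512)

torch.onnx.export(
    model,
    (example["input_ids"], example["attention_mask"]),
    "readmission_classifier.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch"},
                  "attention_mask": {0: "batch"},
                  "logits": {0: "batch"}},
    opset_version=14,
)
```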
Overall performance on five tasks
To assess the breadth of NYUTron’s applicability, we evaluated NYUTron’s performance on five tasks retrospectively. We trained with the full dataset and evaluated performance with two test sets: (1) a random test set (clinical notes sampled from the same time as the training data) and (2) a temporal test set (clinical notes sampled from the future of the training data). The temporal test set more closely resembles the deployment scenario, in which inference data come from the future of the training data. Our battery of tasks consisted of three clinical tasks and two operational tasks, as shown in Fig. 2a . We compared NYUTron against structured baselines, which forward structured features used by traditional clinical predictive models into an extreme gradient-boosted tree 14 model.
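As a rough illustration of what such a structured baseline looks like, the sketch below trains a gradient-boosted tree classifier on LACE-style features with xgboost. The file and column names are invented for the example and do not reflect the study's actual feature extraction.

```python
# Sketch of a structured baseline in the spirit of lace+xgb: LACE-style features
# fed to a gradient-boosted tree classifier. File and column names are illustrative.
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_csv("readmission_structured.csv")   # hypothetical extract from the EHR
features = ["length_of_stay", "acute_admission", "charlson_index", "ed_visits_6mo"]

X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["readmitted_30d"], test_size=0.2, random_state=0)

clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                    eval_metric="logloss")
clf.fit(X_train, y_train)

auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"Structured-baseline AUC: {auc:.3f}")
```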
Fig. 2 | a, The five tasks include three clinical tasks and two operational tasks. b, On readmission prediction, NYUTron had a median AUC of 79.9% ± 0.168% with a 5.36% improvement. On in-hospital mortality prediction, NYUTron had a median AUC of 94.9% ± 0.168% with a 7.43% improvement. On comorbidity index imputation, NYUTron had an OVR median AUC of 89.4% ± 0.275%. A confusion matrix is shown on the right. c, On binned LOS prediction, NYUTron had a median AUC of 78.7% ± 0.179% with a 12.3% improvement from the structured baseline. On insurance denial prediction, NYUTron had a median AUC of 87.2% ± 0.246% with a 14.7% improvement. For b,c, the height of the error bar is the median AUC and the half-width of the error bar is 1 s.d. The grey points are individual data points from n = 5 experiments using distinct random seeds.
NYUTron is capable of being extended to multiple clinical and operational tasks. Figure 2b and Fig. 2c show that, on prediction tasks (in-hospital mortality, readmission, LOS and insurance denial), NYUTron had an area under the curve (AUC) of 78.7–94.9%, with an improvement of 5.36–14.7% in AUC from traditional clinical predictive models. On the comorbidity index imputation task, NYUTron had a median AUC of 89.4% ± 0.275%. We first present our results across four of the tasks and conclude with a focused look at readmission prediction that addresses questions of data efficiency, model generalizability and deployment in a real-world environment.
NYUTron is capable of predicting risk of in-hospital mortality on admission and imputing a comorbidity index. The task of in-hospital mortality prediction was to estimate (at admission) the likelihood of a patient’s death during the present inpatient encounter. Figure 2b shows that, for in-hospital mortality prediction, NYUTron had a median AUC of 94.9% ± 0.168%, with a 7.43% improvement from its structured baseline based on Simplified Acute Physiology Score (SAPS2) 15 and Acute Physiology and Chronic Health Evaluation (APACHE2) 16 features such as age and mean heart rate. The task of comorbidity index imputation was to predict (at admission) the Charlson comorbidity index (CCI) 17 with no available structured features for chronic diseases. We framed this as a data imputation problem, as 22% of our dataset lacked CCI scores and this was a known area for documentation improvement (see Supplementary Information section 2.3 for more context). We discretized the index into four bins according to the original paper’s grades of severity (0, none; 1–2, mild; 3–4, moderate; ≥5, severe). Figure 2b shows that, on comorbidity imputation, NYUTron had a median AUC of 89.4% ± 0.275% and 88% precision when identifying patients whose CCI score was 0.
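The severity binning itself is a simple mapping; a small plain-Python sketch, with the thresholds taken from the grades quoted above, is shown below.

```python
# Sketch of the four-way severity binning described for the comorbidity task
# (0 = none, 1-2 = mild, 3-4 = moderate, >=5 = severe).
def bin_cci(cci: int) -> int:
    """Map a Charlson comorbidity index to a severity class."""
    if cci == 0:
        return 0          # none
    if cci <= 2:
        return 1          # mild
    if cci <= 4:
        return 2          # moderate
    return 3              # severe

assert [bin_cci(x) for x in (0, 1, 2, 3, 4, 5, 9)] == [0, 1, 1, 2, 2, 3, 3]
```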
NYUTron can also be used for operational endpoints and to predict in-patient LOS and insurance claim denial on admission. The task of LOS prediction was to predict (at admission) the likely range of days a patient would stay in the hospital. We discretized LOS into four bins (0–25% quantile, 25–50% quantile, 50–75% quantile, >75% quantile). Figure 2c shows that, for LOS prediction, NYUTron had a median one-versus-rest (OVR) AUC of 78.7% ± 0.179%, with a 12.3% improvement from the structured baseline, which used an available subset of ‘Lisbon Portugal’ features 18 . The task of insurance claim denial prediction was to predict (at admission) whether the insurance claims submitted for an encounter would be accepted or initially denied. Figure 2c shows that, for insurance denial prediction, NYUTron had a median AUC of 87.2% ± 0.246%, with a 14.7% improvement from the structured baseline, which used an available subset of ‘claim form’ features 19 such as age and insurance provider. NYUTron is also capable of predicting different types of denials from both admission notes and discharge notes with similar performance (Supplementary Information section 2.2 ).
Detailed analysis on readmission
To better understand NYUTron’s performance, we carried out a detailed analysis of 30-day all-cause readmission prediction. The task of readmission prediction is to predict (at discharge) the likelihood of a patient coming back to the hospital within 30 days and is a well-studied problem in the medical informatics literature (see Supplementary Information section 2.1 for more details on the readmission prediction task). Figure 2b shows that, for 30-day all-cause readmission prediction, NYUTron had a median AUC of 79.87% ± 0.168%, with a 5.36% improvement from its structured baseline, which used LACE 20 features (a mnemonic for LOS, acuity of admission, Charlson comorbidity index and number of emergency department visits in the past 6 months). We performed five additional evaluations in both retrospective and prospective settings: (1) a human comparison with six attending physicians for prediction of readmission for 20 patient cases sampled from a random split, (2) a study of NYUTron’s scaling properties with respect to data in which NYUTron and other models were compared using a different number of fine-tuned data points, (3) an assessment of NYUTron’s cross-site generalizability using pretraining, fine-tuning and test data from different locations, (4) a prospective, single-arm, non-interventional study to evaluate NYUTron’s deployability and (5) a qualitative evaluation by a physician panel of NYUTron’s prospective performance to assess clinical impacts.
Retrospective study of readmission
On small samples, NYUTron was competitive with a small group of physicians at predicting 30-day readmission. We tested a group of six physicians at different levels of seniority against NYUTron in a head-to-head comparison to establish a baseline difficulty for predicting 30-day all-cause readmission at the time of discharge. Discharge summaries ( n = 20, including 11 positive cases and 9 negative cases) were sampled from a random split and uploaded to an online evaluation platform. Median physician performance was worse than that of NYUTron (Fig. 3a ). For physicians and NYUTron, the median false positive rate (FPR) was 11.11%, whereas the median true positive rate (TPR) was 50% for physicians compared with 81.82% for NYUTron. Physicians had a median F1 score of 62.8% and substantial variance of 22.2% compared with NYUTron, which had a median F1 score of 77.8%.
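For reference, the head-to-head metrics reported here (TPR, FPR and F1) can be computed from binary predictions as in the short sketch below; the label vectors are toy placeholders rather than the study's data.

```python
# Sketch of computing TPR, FPR and F1 from binary predictions with scikit-learn.
# y_true and y_pred are toy placeholders, not the study's data.
from sklearn.metrics import confusion_matrix, f1_score

y_true = [1, 1, 1, 0, 0, 1, 0, 1, 0, 1]     # toy labels (1 = readmitted)
y_pred = [1, 0, 1, 0, 0, 1, 1, 1, 0, 1]     # toy model or physician calls

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)                         # true positive rate (recall)
fpr = fp / (fp + tn)                         # false positive rate
f1 = f1_score(y_true, y_pred)
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}  F1={f1:.2f}")
```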
Fig. 3 | a, On 20 cases sampled from a random split, we compared NYUTron’s TPR and FPR with those for six physicians. NYUTron (orange triangles) had a higher TPR and the same FPR when compared with the median physician performance (green circles). The error band for AUC ranges from the minimum to maximum, and the orange crosses indicate TPR and FPR using all possible thresholds. We chose NYUTron’s threshold on the basis of validation data. b, Comparison of the temporal test AUCs of different pretrained LLMs with an increasing number of fine-tuning examples. For simplicity, we omit the variance and only plot the median performance of five trials. Differences in median performance with 100 and 1,000 examples are less notable because AUCs with sparse fine-tuning examples have high variance (at 100 examples, we had 4.26% to 9.56% variance; at 1,000 examples, we had 0.44% to 9.46% variance). AUC variance decreases with more fine-tuning examples. The horizontal dashed line at 0.75 corresponds to the threshold for performance. See alternative presentations in Extended Data Fig. 7. c, d, Temporal test performance of NYUTron using pretraining, fine-tuning and test data from different sites. For both the Manhattan and Brooklyn tests, the column corresponding to local fine-tuning shows better performance than that with external fine-tuning. Each entry in c, d is presented as the mean ± 1 s.d. for n = 5 experiments using distinct random seeds.
The random split does not resemble the deployment scenario, in which the test data come from the future of the training data. We therefore created a temporal split to simulate deployment and observed a meaningful difference in test statistics compared with the random split (the random test AUC was 84.13%, whereas the temporal test AUC was 80.2%), confirming the importance of this second testing phase (further comparison in Extended Data Fig. 1 ).
NYUTron is competitive with traditional models and other LLMs. We evaluated the effectiveness of NYUTron by comparing its test performance on the temporal split against that of a traditional model and four different types of LLMs. NYUTron had the highest AUC when fine-tuned with the full dataset (Fig. 3b ), with a median AUC of 79.87% ± 0.17%, which was similar to the clinical+web-wiki+bio AUC of 80.14% ± 0.26%. Compared with LLMs pretrained with non-clinical text (web-wiki+bio and web-wiki), NYUTron’s median AUC was 2.37% to 3.23% higher. Compared with the traditional model that uses structured features (lace+xgb), NYUTron had a 5.36% higher AUC. Compared with a model using traditional natural language processing (NLP) embedding (tf-idf+xgb), NYUTron had a 12.8% higher median AUC (Extended Data Fig. 2a ).
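The tf-idf+xgb baseline mentioned above follows a standard bag-of-words pipeline; a hedged sketch is given below, with invented file and column names and no claim to match the study's exact configuration.

```python
# Sketch of a tf-idf + gradient-boosted-tree text baseline of the kind NYUTron is
# compared against. Data loading and column names are illustrative.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_csv("readmission_notes.csv")        # hypothetical notes + labels
X_train, X_test, y_train, y_test = train_test_split(
    df["note_text"], df["readmitted_30d"], test_size=0.2, random_state=0)

vec = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2), min_df=5)
Xtr = vec.fit_transform(X_train)
Xte = vec.transform(X_test)

clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
clf.fit(Xtr, y_train)
print("tf-idf+xgb AUC:", roc_auc_score(y_test, clf.predict_proba(Xte)[:, 1]))
```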
An LLM trained on unstructured clinical notes better scales with data than traditional structured models. Compared with lace+xgb, NYUTron benefits from an increasing amount of labelled examples and achieved a better AUC when fine-tuned with the full dataset. Figure 3b shows that lace+xgb (dashed yellow line) and NYUTron (solid green line) had similar AUCs at 100 and 1,000 examples. However, NYUTron’s AUC consistently improved with more examples whereas lace+xgb’s AUC started to plateau (from 100 to 1,000 examples, NYUTron’s AUC increased by 7.27% whereas that of lace+xgb increased by 3.98%; from 10,000 to 392,336 examples, NYUTron’s AUC increased by 2.15% whereas that of lace+xgb increased by 0.63%). With the full fine-tuning dataset, NYUTron had a 7.04% higher AUC than lace+xgb.
Pretraining on a large amount of unlabelled clinical notes contributes to performance. Compared with the randomly initialized LLM (random-init), NYUTron learns to generalize better from fewer examples. Figure 3b shows that, whereas NYUTron needed 10,000 examples to achieve an AUC of around 75%, random-init needed 100,000 examples. We also observed a similar trend in another clinical prediction task: NYUTron performed better than the random-init model (36.83% higher F1 score) and the non-clinically pretrained models (2.06% to 3.73% higher F1 score) on the clinical named entity recognition (NER) task from the 2012 i2b2 challenge (Extended Data Fig. 2b ).
It is beneficial to match the domain of the pretraining corpus and the domain of the fine-tuning corpus. Figure 3b shows three pieces of evidence: LLMs pretrained on non-clinical text (web-wiki and web-wiki+bio) had similar performance as random-init. A separate LLM, web-wiki+bio+clinical, had similar performance as NYUTron. Third, compared with LLMs pretrained on non-clinical text (web-wiki and web-wiki+bio), clinically pretrained LLMs (NYUTron and web-wiki+bio+clinical) learned to generalize better from fewer examples. See Extended Data Fig. 3 for comparison of the pretraining corpus.
Having a close domain match during pretraining is particularly beneficial in the low-data setting during fine-tuning. We compared two language models that were pretrained on clinical text from different hospital systems, NYUTron (NYU Langone Health) and web-wiki+bio+clinical (University of Florida). Figure 3b shows that, at 1,000 examples, NYUTron (the in-domain model) had a higher AUC for NYU Langone readmission prediction than web-wiki+bio+clinical (the out-of-domain model). Notably, NYUTron’s advantage disappeared as the number of fine-tuning examples increased, suggesting that sufficient in-domain fine-tuning can adapt models that were pretrained out of domain.
Clinical language models show generalizability to different sites through local fine-tuning. To investigate the robustness of NYUTron across clinical environments, we chose two hospitals that are geographically separated within the NYU Langone Health System. For brevity, we refer to Tisch Hospital in Manhattan as ‘Manhattan’, NYU Langone Hospital–Brooklyn as ‘Brooklyn’ and all four hospitals within the NYU Langone Health System (Manhattan, Brooklyn, NYU Langone Orthopedic Hospital and NYU Langone Hospital–Long Island) as ‘all sites’. We considered three LLMs pretrained on different sites: the first one was pretrained in Manhattan, the second one was pretrained in Brooklyn and the third one was pretrained on all sites. For each of the pretrained LLMs, we fine-tuned the LLM with a readmission dataset from either Manhattan or Brooklyn. Finally, we asked the fine-tuned LLM to predict readmission on the basis of discharge notes from either Manhattan or Brooklyn. Figure 3c,d shows that the LLM pretrained on all sites had the best performance on both ‘test Manhattan’ and ‘test Brooklyn’. For all the LLMs, fine-tuning with the local dataset (‘fine-tune Manhattan/Brooklyn’) led to a higher test AUC at the test site (‘test Manhattan/Brooklyn’) compared with fine-tuning at another site (‘fine-tune Brooklyn/Manhattan’). Therefore, pretraining with data from all sites and local fine-tuning is the best way to optimize performance. We performed additional analyses that showed that NYUTron is able to generalize to a different health system through local fine-tuning (Supplementary Information section 4.1 and Extended Data Fig. 4 ) and compared the robustness of NYUTron and lace+xgb with respect to training sites (Supplementary Information section 4.2 ). We also found that NYUTron is sensitive to notes from different clinical departments and patients with different demographics and that its performance fluctuates over months (Extended Data Figs. 5 and 6 ). The causes of the discrepancies can be very complex (discussed in Supplementary Information section 4.3 ) and will be studied in future work.
Prospective study of readmission
To assess NYUTron’s performance outside the development environment, we selected a model on the basis of the retrospective trial results and ran a prospective trial from January to April 2022. During this time period, we deployed NYUTron in an accelerated format and loaded it into an inference engine, which interfaces with the EHR, to read discharge notes as they were signed by treating physicians. In this period, there were 29,286 discharged encounters, with 3,271 patients (11.17%) returning within 30 days. NYUTron predicted 2,692 of the 3,271 readmissions (82.30% recall) with 20.58% precision. Figure 4a shows that NYUTron had an AUC of 78.70%.
Fig. 4 | a, NYUTron had an AUC of 78.70% in a prospective, single-arm, non-interventional trial with recall of 82.3% and precision of 20.6%. b, A panel of six physicians reviewed NYUTron’s results for potential clinical impact. Of 100 readmissions that were successfully identified by NYUTron, 61% were unplanned readmissions, 50% would have resulted in a penalty under CMS guidelines and 27% were preventable at the time of discharge according to the consensus opinion of the multi-specialty panel of physicians who reviewed cases from the prospective trial. See Supplementary Information section 2.1 for a discussion of the readmission label and the practical significance of the observed performance.
To gauge the potential clinical impact, a group of six physicians performed a qualitative evaluation of 100 randomly sampled readmitted cases that were captured by NYUTron following the trial’s conclusion. Physician review suggested that some true positive predictions by NYUTron are clinically meaningful, preventable readmissions. Overall, readmitted patients who were predicted to be readmitted were 6.02 times more likely to die in hospital and stay 2.93 days longer (P < 10⁻⁴). As shown in Fig. 4b, 61% of the predicted cases were unplanned, and the mean predicted probabilities for these unplanned readmissions were lower than those for planned readmissions (31.9% ± 31.1% versus 82.1% ± 27.3%; P < 10⁻⁴). Among the unplanned readmissions, 19.67% of patients experienced an adverse event or death on readmission, with 50% of these events considered preventable by the physician panel. From a financial standpoint, 81.9% of the unplanned readmissions would be penalized according to Centers for Medicare and Medicaid Services (CMS) guidelines. Among the penalizable cases, 54% were considered preventable. Notably, 3 of the 27 preventable readmissions had Clostridioides difficile enterocolitis, a contagious, healthcare-associated bacterial infection that causes 1 in 11 people over age 65 to die within 1 month 21.
We present our work in developing, training, validating and deploying NYUTron, a health system-scale LLM designed and validated for clinical use. We demonstrate NYUTron’s performance on three clinical tasks (in-patient mortality prediction, comorbidity index prediction and readmission prediction) and two operational tasks (insurance claim denial prediction and inpatient LOS prediction). We also performed a detailed analysis of readmission prediction owing to its clinical and operational importance and its well-documented history in the medical informatics literature. We view the flexibility of our approach in using an encoder architecture (BERT), which relies on only unstructured text inputs to generate a single prediction, as being a virtue, and we anticipate many future tasks built on this fundamental paradigm to assist with multiple aspects of patient care and automating hospital operations.
An ethical consideration in deployment is that physicians and administrators could over-rely on NYUTron’s predictions owing to its seamless integration with existing medical workflows, thereby leading to undesirable outcomes. Further research is needed to optimize human–AI interactions, as well as development of standardized assessments for sources of bias or other unexpected failure points. Ongoing work from our group around measuring the similarity between language models’ sensitivity patterns and those of physicians through token-level perturbations of the clinical notes 22 is one among many such efforts.
Large, generative LLMs also present a unique opportunity for integration into medical workflows; however, they are highly dependent on user inputs and prompting 23 and are not as easily adapted for automation of basic clinical and operational tasks. The seamless integration into existing medical informatics workflows is a virtue of our approach, and we hope that this work presents itself as a flexible solution to the last-mile problem—any structured data algorithm can be reconceptualized and rapidly prototyped within this framework. As part of monitoring the impact of such a system on physician behaviour and on patients, there should be a level of continuous supervision to capture human–machine interactions, as well as mitigate the risk of model drift over time. We discuss our implementation of such a system in Supplementary Information section 5 .
Our approach of using a smaller (<1 billion parameters) encoder language model trained on highly tailored data represents a marked departure from the current trend in language model research that focuses on massive (>1 billion parameters), generative models pretrained on large, non-specific datasets. Nonetheless, even relatively small LLMs, such as the ones used in this study, require a substantial amount of compute time for pretraining. Our pretraining used 24 NVIDIA A100 GPUs with 40 GB of VRAM for 3 weeks, and our fine-tuning used 8 A100 GPUs for 6 hours per run. This amount of computation is not commonly accessible to research groups, although we note that it is less than that in similar LLM projects routinely pursued by industry research groups and that our results indicate that massive pretraining may not be necessary to obtain highly performant models. Our results show that high-quality datasets for fine-tuning are more valuable than pretraining, and, on the basis of our experimental results, we recommend that users locally fine-tune an externally pretrained language model when computational ability is limited. Regarding the choice for the externally pretrained model, we further recommend using a model pretrained with a large amount of in-domain clinical text, although we note that large, out-of-domain models can be highly performant, particularly when combined with in-domain fine-tuning. Work with larger decoder-based architectures has also demonstrated a benefit with fine-tuning on medical data or prompt tuning with chain of thought, instructions and related techniques 24 , 25 , which further emphasizes the necessity of accounting for the domain shift from general to medical text for LLM work in the medical sciences. Although we have not compared these approaches directly (which would require more medical text or fusion with general-domain text for training a compute-optimal model 26 ), we believe that this could be an interesting future direction for research and that, in the end, approaches combining these different approaches to language modelling may prove to be complementary depending on the use case.
The ultimate validation of our approach must come from randomized controlled trials of interventions tied to individual task predictions to assess their clinical impact and from user feedback as we continue to integrate NYUTron into health systems. As we plan this within our own health system, we recommend the consideration of different levels of intervention depending on the predicted risk of patients for each task. For instance, for a patient at low risk for 30-day readmission, follow-up calls could be scheduled; for a high-risk patient, care should be taken to limit premature discharge. All interventions should be decided on with physician supervision, although many of the operational uses can probably be fully automated.
It is a long-standing dream for physicians to have AI assistants observing care along with them and chiming in with predictions and advice. To take a step towards this futuristic vision, we trained an LLM, NYUTron, on the entire EHR of a large healthcare system to read physician notes and make several of these predictions across a wide range of clinical and operational tasks. We deployed NYUTron in a live healthcare environment and demonstrate its efficacy at predicting 30-day readmission while being integrated seamlessly into clinical workflows. We believe that this work opens the door to translating the progress in modern natural language processing and deep learning to improving the quality and affordability of healthcare, and we are excited to see what comes next.
Pretraining datasets
NYU Notes
We created this dataset of unlabelled clinical notes directly from the NYU Langone EHR. The dataset contains 387,144 patients, 7,247,694 notes and 4,112,249,482 words in total. We built NYU Notes as follows: we wrote structured query language (SQL) scripts to query the NYU Langone EHR. We first prototyped the queries with an interactive web-based editor (Cloudera Hue) and then downloaded the query results as comma-separated files (CSVs) to NYU Langone’s high-performance computing cluster. We included notes signed by medical professionals (physicians, residents, physician assistants, nurse practitioners and fellows) at Tisch Hospital, NYU Langone Hospital–Brooklyn, NYU Langone Hospital–Long Island and NYU Langone Orthopedic Hospital from 2011 to 2020 (inclusive). We excluded any notes that were derived from billing, labelled as invalid, or empty. We split the notes into three sets, training, validation and test sets, with a ratio of 949:50:1. Lastly, we masked tokens with 15% probability to create masked text and labels.
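A minimal sketch of this preparation step, using the Hugging Face datasets and transformers libraries, is shown below: it reproduces the 949:50:1 split and the 15% dynamic masking, while the input file and tokenizer checkpoint are placeholders rather than the actual NYU Notes corpus or NYUTron tokenizer.

```python
# Sketch of the pretraining data preparation: a 949:50:1 train/validation/test
# split of notes and dynamic 15% token masking for the MLM objective.
# The input file and tokenizer checkpoint are illustrative assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

ds = load_dataset("text", data_files={"all": "nyu_notes.txt"})["all"]

# 949:50:1 split (train : validation : test).
split = ds.train_test_split(test_size=51 / 1000, seed=0)
heldout = split["test"].train_test_split(test_size=1 / 51, seed=0)
train, validation, test = split["train"], heldout["train"], heldout["test"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder tokenizer
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm=True, mlm_probability=0.15)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train = train.map(tokenize, batched=True, remove_columns=["text"])
# `collator` masks 15% of tokens on the fly each time a batch is drawn; it would
# be passed as data_collator to a Trainer running the MLM pretraining loop.
```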
NYU Notes–Manhattan
We created this dataset of unlabelled clinical notes as the subset of NYU Notes that were written in Tisch Hospital in Manhattan. The dataset contains 256,217 patients, 4,342,602 notes and 2,381,466,993 words in total.
NYU Notes–Brooklyn
We created this dataset of unlabelled clinical notes as the subset of NYU Notes that were written in NYU Langone Health–Brooklyn. The dataset contains 104,521 patients, 1,337,352 notes and 1,102,078,012 words in total.
Fine-tuning datasets
NYU Readmission
We created this dataset of labelled discharge notes (with binary labels for readmission) from the NYU Langone EHR. Most of the notes from this dataset are a subset of NYU Notes, with additional discharge notes from 2021 for the temporal test. The dataset contains 413,845 patients, 506,740 notes and 487,395,462 words in total. We built this dataset as follows: for each encounter that ended between January 2011 and November 2021, we included its discharge note with a binary label for 30-day all-cause readmission. We assigned the ‘readmitted’ label if the patient had an admission note within 30 days of being discharged. To focus on modelling acute care readmission, we excluded discharge notes from the rehabilitation, dialysis and palliative care departments because these were not acute care admissions. We split the dataset into four sets: training, validation, test and temporal test sets. The first three sets were notes from January 2011 to May 2021, with a ratio of 8:1:1. The temporal test set included notes from June to December 2021. See Extended Data Fig. 8a for a visualization of the four-way split.
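The labelling rule described here (an admission within 30 days of discharge) can be expressed with a simple pandas sketch; the file and column names are illustrative, not the actual EHR schema.

```python
# Sketch of the 30-day readmission labelling rule: a discharge is positive if the
# same patient has a subsequent admission within 30 days. Column names are
# illustrative, not the actual EHR schema.
import pandas as pd

enc = pd.read_csv("encounters.csv", parse_dates=["admit_time", "discharge_time"])
enc = enc.sort_values(["patient_id", "admit_time"])

# Time of the patient's next admission, aligned to each encounter.
enc["next_admit"] = enc.groupby("patient_id")["admit_time"].shift(-1)

gap = enc["next_admit"] - enc["discharge_time"]
enc["readmitted_30d"] = ((gap >= pd.Timedelta(0)) &
                         (gap <= pd.Timedelta(days=30))).astype(int)
```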
NYU Readmission–Manhattan
We created this dataset of labelled discharge notes as the subset of notes in the NYU Readmission dataset that were written in Tisch Hospital in Manhattan. The dataset contains 240,824 patients, 296,519 notes and 253,622,053 words.
NYU Readmission–Brooklyn
We created this dataset of labelled discharge notes as the subset of clinical notes from the NYU Readmission dataset that were written in NYU Langone Health–Brooklyn. The dataset contains 94,653 patients, 113,275 notes and 142,767,957 words.
NYU Mortality
We created this dataset of history and physical (H&P) notes with binary labels for in-hospital mortality from the NYU Langone EHR. Most of the notes from this dataset are a subset of NYU Notes, with additional H&P notes from 2021 for the temporal test. The dataset contains 371,922 patients, 469,162 notes and 484,467,141 words in total. We built this dataset as follows: for each encounter that ended between January 2011 and November 2021, we included its H&P note with a binary label for in-hospital mortality. We assigned the positive label if the patient’s discharge disposition was ‘expired’. We split the dataset into four sets: training, validation, test and temporal test sets. The first three sets were notes from January 2011 to May 2021, with a ratio of 8:1:1, and the temporal test set included notes from June to December 2021.
NYU Binned Comorbidity
We created this dataset of H&P notes with five class labels for the binned Charlson comorbidity index (CCI) from the NYU Langone EHR. Most of the notes from this dataset were a subset of NYU Notes, with additional H&P notes from 2021 for the temporal test. The dataset contains 327,039 patients, 403,579 notes and 422,485,417 words in total. The dataset contains fewer labelled encounters than the NYU Mortality and NYU Binned LOS datasets because 22% of the encounters had no International Classification of Diseases (ICD) codes to calculate the CCI score. This missingness motivated our task of predicting binned CCI score with a lack of structured ICD codes. We built this dataset as follows: for each encounter that ended between January 2011 and November 2021, we included its H&P note with a five-class label for binned CCI score. To generate the labels, we first calculated the comorbidity index using the ICD codes and the scoring function in ref. 27. We then discretized the scores into five classes: we assigned label 0 for a comorbidity index below the 50% quantile (index 0), label 1 for a comorbidity index between the 50% and 75% quantile (index 1–2), label 2 for a comorbidity index between the 75% and 90% quantile (index 3–4), label 3 for a comorbidity index between the 90% and 99% quantile (index 4–7) and label 4 for a comorbidity index above the 99% quantile (index >7). We split the dataset into four sets: training, validation, test and temporal test sets. The first three sets were notes from January 2011 to May 2021, with a ratio of 8:1:1, and the temporal test set included notes from June to December 2021.
NYU Binned LOS
We created this dataset of H&P notes with quantile labels for hospital LOS from the NYU Langone EHR. Most of the notes from this dataset were a subset of NYU Notes, with additional H&P notes from 2021 for the temporal test. The dataset contains 371,922 patients, 469,162 notes and 484,467,141 words in total. We built this dataset as follows: for each encounter that ended between January 2011 and November 2021, we included its H&P note with a binary label and a quantile label for LOS. For the quantile label, we assigned label 0 for an LOS below the 25% quantile (0–2 days), label 1 for an LOS between the 25% and 50% quantile (3 days), label 2 for an LOS between the 50% and 75% quantile (4–5 days) and label 3 for an LOS above the 75% quantile (>5 days). We split the dataset into four sets: training, validation, test and temporal test sets. The first three sets were notes from January 2011 to May 2021, with a ratio of 8:1:1, and the temporal test set included notes from June to December 2021.
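The quantile binning used for the LOS labels (and, analogously, for the binned comorbidity labels above) can be sketched as follows. The LOS values below are toy numbers and the boundary handling at the exact cut points is approximate; the real labels use the EHR-derived quantiles described in the text.

```python
import numpy as np

# Hypothetical LOS values in days; the real labels come from the NYU Langone EHR.
los_days = np.array([1, 2, 3, 3, 4, 5, 6, 8, 10, 2])

# Quantile cut points described in the text: 25%, 50% and 75%.
cuts = np.quantile(los_days, [0.25, 0.50, 0.75])

# np.digitize assigns 0 below the first cut, 1 between the first and second cuts,
# 2 between the second and third, and 3 above the last cut.
labels = np.digitize(los_days, cuts, right=True)
print(cuts, labels)
```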
NYU Insurance Denial
We created this dataset of H&P notes with binary labels for whether the patient’s insurance claim was initially rejected or directly approved. The dataset contains 54,563 patients, 55,791 notes and 51,270,256 words in total. We built this dataset as follows: for each encounter that occurred between May 1, 2021, and April 30, 2022, we included its H&P note with a binary label for insurance denial. We assigned a positive label if the patient’s insurance claim status was ‘final, adverse determination’ (the claim was rejected by insurance and was again rejected following appeal) or ‘final, favorable determination’ (the claim was rejected by insurance and approved following appeal). We split the dataset into four sets: training, validation, test and temporal test sets. The first three sets were notes from May 1, 2021, to February 28, 2022, with a ratio of 18:1:1. The temporal test set included notes from March 1 to April 30, 2022.
NYU Insurance Denial–Discharge Notes
We created this dataset of discharge notes with binary labels for whether the patient’s insurance claim was initially rejected or directly approved. The dataset contains 54,563 patients, 55,791 notes and 49,405,133 words in total. We built this dataset as follows: for each encounter that occurred between May 1, 2021, and April 30, 2022, we included its discharge note with a binary label for insurance denial. The label assignment and four-way split were the same as in the NYU Insurance Denial dataset.
NYU Insurance Eventual Denial, H&P
This dataset contained the same notes as the NYU Insurance Denial dataset, but the labels were different. The binary label indicated whether the patient’s insurance claim was eventually rejected (even after appeal) or was eventually approved (direct approval or approval after appeal).
NYU Insurance Eventual Denial–Discharge
This dataset contained the same notes as the NYU Insurance Denial–Discharge Notes dataset, but the labels were different. The binary label indicated whether the patient’s insurance claim was eventually rejected (even after appeal) or was eventually approved (direct approval or approval after appeal).
i2b2-2012 NER
This is an open dataset released by the Harvard Medical School as part of an annual clinical NLP challenge 28 . This dataset is a well-known benchmark in the clinical NLP community. The task is to identify and classify clinical concepts (for example, treatments), clinical departments (for example, surgery), occurrences of events (for example, admission) and evidentials (for example, the patient complained) from de-identified clinical notes from Beth Israel Medical Center in Boston. The dataset contains no more than 310 patients, 310 notes and 636,000 words. We downloaded the dataset as a compressed tar.gz file from the n2c2 data portal after our use application was approved.
MIMIC-III Readmission
This is an open dataset for an intensive care unit (ICU) EHR released by MIT and Boston Beth Israel Medical Center 29. We collected a set of 52,726 discharge notes and created a 30-day all-cause readmission label by checking whether there was any subsequent encounter within 30 days. The readmission rate was 6%. We split the data into training, validation and test sets in an 8:1:1 ratio.
Deployment dataset
NYU Readmission–Deployment
This dataset consists of discharge notes with binary labels for readmission from our deployment engine and the NYU Langone EHR. From January to April 2022, every time a discharge note was signed by a physician, the note was sent to our custom inference engine for NYUTron’s prediction. The paired discharge note and prediction were recorded in a database. The database contained 27,376 patients, 29,287 notes and 34,669,963 words by the end of the study period.
Structured datasets
NYU Readmission–LACE
We created this dataset of structured LACE 30 features with binary labels for readmission for comparison against the unstructured models. The dataset contains structured features for all encounters in the NYU Readmission dataset. LACE is a traditional clinical prediction rule for readmission with four features: LOS, acuity of admission, Charlson comorbidity index and number of emergency department visits in the past 6 months. We built the dataset as follows: for every encounter in the NYU Readmission dataset, we collected data on the four LACE features from the NYU Langone EHR. LOS was the difference (in days) between the discharge date and the admission date. Acuity of admission was a binary feature indicating whether the patient was admitted through the emergency department. The comorbidity index was calculated with the ICD-9 or ICD-10 codes for chronic diseases, on the basis of the mapping algorithm in ref. 31 and the scoring function in ref. 27. The number of emergency department visits was calculated from the patient’s encounter history up to 6 months before the admission date.
NYU Readmission–LACE, Manhattan
We created this dataset of structured LACE features from the subset of notes from the NYU Readmission–LACE dataset that were written in Tisch Hospital in Manhattan.
NYU Readmission–LACE, Brooklyn
We created this dataset of structured LACE features from the subset of notes from the NYU Readmission–LACE dataset that were written in NYU Langone Health–Brooklyn.
NYU Mortality–SAPS2 + APACHE2
We created this dataset of structured SAPS2 + APACHE2 features with binary labels for in-hospital mortality to compare against the unstructured data model. The dataset contains a subset of structured SAPS2 + APACHE2 features for all encounters in the NYU Mortality dataset. SAPS2 + APACHE2 features are a subset of the features used in the SAPS2 model 15 and the APACHE2 model 16 for ICU mortality prediction. We selected the subset of features that were available in the NYU Langone EHR. We included the following 12 features: age (numerical), mean heart rate (numerical), systolic blood pressure (numerical), body temperature (numerical), blood urea nitrogen concentration (numerical), sodium concentration (numerical), potassium concentration (numerical), bilirubin concentration (numerical), white blood cell count (numerical), pH (numerical), creatinine concentration (numerical) and haematocrit (numerical). We additionally included department specialty (categorical). We excluded the following features owing to their unavailability: PaO2/FiO2 (ratio of arterial oxygen partial pressure to fractional inspired oxygen), whether the patient was on mechanical ventilation or continuous positive airway pressure (CPAP), bicarbonate concentration, urine output, Glasgow Coma Scale score, presence of metastatic cancer, haematological malignancy or AIDS, and whether the admission was scheduled.
NYU Binned LOS–Lisbon Portugal
We created this dataset of structured ‘Lisbon Portugal’ features with quantile labels for LOS for comparison against the unstructured data model. The dataset contains a subset of the features used in the Lisbon Portugal dataset 18 (which is widely used in the LOS prediction literature) for all encounters in the NYU Binned LOS dataset. We selected a subset of 12 features that were available in the NYU Langone EHR: gender (categorical), age as measured by the difference in years between the birth date and the admission date (numerical), highest level of education (categorical), country (categorical), postal code as address (categorical), marital status (categorical), admission type (categorical), admission service type (categorical), provider ID (categorical), department specialty (categorical), procedure name (categorical) and number of previous admissions (numerical). We left out diagnosis because it is not always available at the time of writing H&P notes. We excluded the following three features owing to difficulty in finding them in the NYU Langone EHR: homogeneous diagnosis group code, major diagnostic category and treatment.
NYU Insurance Denial–Claim Forms
We created this structured dataset based on the NYU Insurance Denial dataset for comparison against the unstructured data model. The dataset contains structured features for all encounters in the NYU Insurance Denial dataset and has the same splits as the NYU Insurance Denial dataset. Selection of structured features was based on the features in ref. 19 , which built a model that predicts insurance claim denial from demographic and care-related features found in the claim form. We found eight available features in the NYU Langone EHR: patient name (categorical), age (numerical), gender (categorical), postal code as a generalization of address (categorical), insurance brand (categorical), first insurance plan name (categorical), provider ID (categorical) and provider type (categorical). We additionally added four features based on the clinician’s inputs: second insurance plan code (categorical), a binary flag for surgical cases (categorical), a binary flag for emergency department cases (categorical) and a binary flag for Medicare fee-for-service users (categorical). We left out six features in ref. 19 owing to difficulty in searching for them: the patient’s relationship to the insured person, network type, whether the claim was a resubmission, diagnosis pointer, charge of service and prior authorization number.
Preprocessing
Pretraining datasets (NYU Notes, NYU Notes–Manhattan, NYU Notes–Brooklyn)
Using these datasets, we trained an uncased BERT wordpiece tokenizer with a vocabulary size of 50,000 tokens, a maximum sequence length of 512 tokens and special tokens [SEP], [PAD], [UNK], [MASK] and [CLS]. Because most of the clinical notes had more than 512 tokens, we split each long note into non-overlapping chunks that were under the maximum sequence length. Specifically, we split each note into sentences using natural language toolkit (nltk) 32 and tokenized each sentence. For sentences that were longer than 512 tokens, we truncated them. Next, for all tokenized sentences in the same note, we concatenated them into groups such that each group had exactly the maximum sequence length. We discarded any remaining group (with a length strictly less than the maximum) of a long note.
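A minimal sketch of this chunking step is shown below, using nltk sentence splitting and the generic bert-base-uncased tokenizer as a stand-in for the custom 50,000-token wordpiece tokenizer trained on NYU Notes.

```python
import nltk
from transformers import AutoTokenizer

nltk.download("punkt", quiet=True)
nltk.download("punkt_tab", quiet=True)  # needed by newer nltk releases; harmless otherwise

# Stand-in tokenizer; the actual pipeline uses the custom uncased wordpiece
# tokenizer (50,000-token vocabulary) trained on NYU Notes.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
MAX_LEN = 512

def chunk_note(note_text, max_len=MAX_LEN):
    """Split one long note into non-overlapping chunks of exactly max_len token IDs."""
    chunks, current = [], []
    for sentence in nltk.sent_tokenize(note_text):
        ids = tokenizer.encode(sentence, add_special_tokens=False)[:max_len]  # truncate long sentences
        current.extend(ids)
        while len(current) >= max_len:
            chunks.append(current[:max_len])
            current = current[max_len:]
    # Any trailing remainder shorter than max_len is discarded, as described above.
    return chunks

chunks = chunk_note("The patient was admitted with chest pain. " * 200)
print(len(chunks), {len(c) for c in chunks})
```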
Fine-tuning datasets (NYU Readmission, NYU Readmission–Manhattan, NYU Readmission–Brooklyn, NYU Mortality, NYU Binned LOS, NYU Insurance Denial, NYU Binned Comorbidity)
Using the tokenizer trained on NYU Notes, we first tokenized each note. We truncated notes that exceeded the maximum sequence length of 512 tokens. We leave the design of a language model that efficiently reads longer clinical notes to future work (see Extended Data Fig. 8b for the impact of note length on language model performance).
NER dataset (i2b2-2012)
We first decompressed the tar.gz files into folders of XML files. We then converted the XML files to the brat annotation format and then to BIO-format files. Finally, we wrote a custom HuggingFace 33 data loader to convert the folder of BIO files into a HuggingFace dataset. Our code for preprocessing is available on GitHub.
Deployment datasets
We first cleaned the notes by stripping out HTML artifacts. We then tokenized the discharge note using NYUTron’s tokenizer. We truncated notes that exceeded the maximum sequence length of 512 tokens.
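A rough sketch of this inference-time preprocessing is shown below, with an off-the-shelf tokenizer standing in for NYUTron’s tokenizer and illustrative regular expressions for the HTML stripping (the production cleaning rules may differ).

```python
import re
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # stand-in for NYUTron's tokenizer

def preprocess_for_inference(raw_note, max_len=512):
    """Strip HTML artifacts and return truncated token IDs for a single note."""
    text = re.sub(r"<[^>]+>", " ", raw_note)   # drop HTML tags
    text = re.sub(r"&[a-z]+;", " ", text)      # drop HTML entities such as &nbsp;
    text = re.sub(r"\s+", " ", text).strip()
    return tokenizer(text, truncation=True, max_length=max_len, return_tensors="pt")

batch = preprocess_for_inference("<p>Patient discharged&nbsp;home in stable condition.</p>")
print(batch["input_ids"].shape)
```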
Structured datasets (NYU Readmission–LACE, NYU Mortality–SAPS2 + APACHE2, NYU Binned LOS–Lisbon Portugal, NYU Insurance Denial–Claim Forms)
When a numerical feature was missing (for example, the average heart rate was NaN), we imputed it with the mean of that feature across the training set. When a categorical feature was missing (for example, the admitting department was ‘unspecified’), we assigned it to the category ‘none’.
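For example, a simple pandas version of this imputation scheme (with hypothetical feature names) could look like the following; the key point is that the numerical fill values are computed on the training split only.

```python
import pandas as pd

# Hypothetical structured features; the real LACE/SAPS2+APACHE2 tables differ.
train = pd.DataFrame({"mean_heart_rate": [72.0, None, 88.0], "department": ["medicine", None, "surgery"]})
test = pd.DataFrame({"mean_heart_rate": [None, 95.0], "department": [None, "neurology"]})

numerical = ["mean_heart_rate"]
categorical = ["department"]

# Numerical features: fill with the training-set mean (computed on train only).
train_means = train[numerical].mean()
train[numerical] = train[numerical].fillna(train_means)
test[numerical] = test[numerical].fillna(train_means)

# Categorical features: keep missing values as an explicit 'none' category.
train[categorical] = train[categorical].fillna("none")
test[categorical] = test[categorical].fillna("none")

print(test)
```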
Pretraining
We pretrained a 109-million-parameter BERT model using the preprocessed NYU Notes and the MLM objective for 3 weeks (96 epochs) on 24 NVIDIA A100 GPUs distributed over three compute nodes, until the validation loss started to plateau. The model has 12 hidden layers with a hidden dimension of 768 and 12 attention heads per layer. We used a per-device training batch size of 64 and saved a checkpoint every 2,000 steps. We used the Zero Redundancy AdamW optimizer (an improvement over the Adam optimizer) with a constant learning rate of 5 × 10⁻⁵, FP16 mixed precision and stage 2 parallelization 34,35,36.
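A compressed, single-process sketch of this pretraining setup with the HuggingFace Trainer is shown below. The tokenizer, corpus and scale are toy stand-ins: the actual run used the custom wordpiece tokenizer, the preprocessed NYU Notes chunks, FP16 mixed precision and DeepSpeed ZeRO stage 2 across 24 A100 GPUs.

```python
from datasets import Dataset
from transformers import (BertConfig, BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Stand-in tokenizer and a toy corpus; the real run uses the custom 50,000-token
# wordpiece tokenizer and the preprocessed NYU Notes chunks.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
texts = ["patient admitted with chest pain", "discharged home in stable condition"] * 8
dataset = Dataset.from_dict({"text": texts}).map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

# ~109M-parameter BERT-base shape: 12 layers, hidden size 768, 12 attention heads.
config = BertConfig(vocab_size=tokenizer.vocab_size, hidden_size=768,
                    num_hidden_layers=12, num_attention_heads=12)
model = BertForMaskedLM(config)  # randomly initialized, trained from scratch

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir="mlm_pretrain",
    per_device_train_batch_size=64,
    learning_rate=5e-5,
    lr_scheduler_type="constant",
    save_steps=2000,
    fp16=False,  # the actual run used FP16 and DeepSpeed ZeRO stage 2
    num_train_epochs=1,
)
Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()
```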
Fine-tuning
NYUTron + discharge notes for readmission prediction
We replaced the trained MLM classifier with a randomly initialized linear classifier after the last hidden layer of the pretrained BERT model. We fine-tuned the model end to end using the training set of the NYU Readmission dataset for ten epochs, evaluating the validation AUC every half epoch and stopping early with a patience of 5. We used the following hyperparameters from manual tuning based on the validation AUC: a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a per-device batch size of 4. We optimized the cross-entropy loss using the AdamW optimizer. To study the effect of training set size, we fine-tuned the pretrained model using subsamples of the NYU Readmission dataset of varying size (N ∈ {10², 10³, 10⁴, 10⁵, 3.92336 × 10⁵}) and evaluated their AUC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For comparison, we looked at the median AUC and the standard deviation of the five experiments.
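The fine-tuning recipe shared by the tasks in this section can be sketched with the HuggingFace Trainer as follows. The corpus is a toy stand-in and bert-base-uncased stands in for the pretrained NYUTron checkpoint; the exact evaluation and checkpointing cadence of the real runs may differ in detail.

```python
import numpy as np
from datasets import Dataset
from sklearn.metrics import roc_auc_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          EarlyStoppingCallback, Trainer, TrainingArguments)

# Stand-in checkpoint and toy data; the real run starts from the pretrained
# NYUTron weights and the NYU Readmission training split.
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

def make_split(texts, labels):
    ds = Dataset.from_dict({"text": texts, "label": labels})
    return ds.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512), batched=True)

train = make_split(["note a", "note b"] * 8, [0, 1] * 8)
val = make_split(["note c", "note d"] * 4, [0, 1] * 4)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return {"auc": roc_auc_score(labels, probs[:, 1])}

batch_size = 4
steps_per_half_epoch = max(1, len(train) // (2 * batch_size))  # evaluate roughly every half epoch
args = TrainingArguments(
    output_dir="readmission_finetune", num_train_epochs=10,
    learning_rate=2e-5, weight_decay=0.01, per_device_train_batch_size=batch_size,
    evaluation_strategy="steps", eval_steps=steps_per_half_epoch,
    save_strategy="steps", save_steps=steps_per_half_epoch,
    load_best_model_at_end=True, metric_for_best_model="auc", seed=0,
)
trainer = Trainer(model=model, args=args, train_dataset=train, eval_dataset=val,
                  compute_metrics=compute_metrics,
                  callbacks=[EarlyStoppingCallback(early_stopping_patience=5)])
trainer.train()
```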
NYUTron + H&P notes for in-hospital mortality prediction
We replaced the trained MLM classifier with a randomly initialized linear classifier after the last hidden layer of the pretrained BERT model. We fine-tuned the model end to end using the training set of the NYU Mortality dataset for ten epochs, evaluating the validation AUC every half epoch and stopping early with a patience of 5. We used the following hyperparameters from manual tuning based on the validation AUC: a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a per-device batch size of 4. We optimized the cross-entropy loss using the AdamW optimizer. In addition to the full dataset, we fine-tuned the pretrained model using subsamples of the NYU Mortality dataset and evaluated their AUC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For comparison, we looked at the median AUC and the standard deviation of the five experiments.
NYUTron + H&P notes for binned comorbidity prediction
We replaced the trained MLM classifier with a randomly initialized linear classifier after the last hidden layer of the pretrained BERT model. We fine-tuned the model end to end using the training set of the NYU Binned Comorbidity dataset for ten epochs, evaluating the validation OVR AUC every half epoch and stopping early with a patience of 5. We used the following hyperparameters from manual tuning based on the validation OVR AUC: a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a per-device batch size of 4. We optimized the cross-entropy loss using the AdamW optimizer. In addition to the full dataset, we fine-tuned the pretrained model with subsamples of the NYU Binned Comorbidity dataset and evaluated their OVR AUC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For comparison, we looked at the median OVR AUC and the standard deviation of the five experiments.
NYUTron + H&P notes for binned LOS prediction
We replaced the trained MLM classifier with a randomly initialized linear classifier after the last hidden layer of the pretrained BERT model. We fine-tuned the model end to end using the training set of the NYU Binned LOS dataset for ten epochs, evaluating the validation OVR AUC every half epoch and stopping early with a patience of 5. We used the following hyperparameters from manual tuning based on the validation OVR AUC: a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a per-device batch size of 4. We optimized the cross-entropy loss using the AdamW optimizer. In addition to the full dataset, we fine-tuned the pretrained model with subsamples of the NYU Binned LOS dataset and evaluated their OVR AUC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For inference, we combined the last two classes, label 3 (90–99% quantile) and label 4 (>99% quantile), because label 4 was very sparse. For comparison, we looked at the median OVR AUC and the standard deviation of the five experiments.
NYUTron + H&P notes for insurance denial prediction
We replaced the trained MLM classifier with a randomly initialized linear classifier after the last hidden layer of the pretrained BERT model. We fine-tuned the model end to end using the training set of the NYU Insurance Denial dataset for ten epochs, evaluating the validation AUC every half epoch and stopping early with a patience of 5. We used the following hyperparameters from manual tuning based on the validation AUC: a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a per-device batch size of 4. We optimized the cross-entropy loss using the AdamW optimizer. In addition to the full dataset, we fine-tuned the pretrained model using subsamples of the NYU Insurance Denial dataset and evaluated their AUC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For comparison, we looked at the median AUC and the standard deviation of the five experiments.
NYUTron + clinical notes for NER
We performed the fine-tuning experiments as follows. For each LLM in Extended Data Table 2, we initialized a HuggingFace token classification model with the LLM as the pretrained checkpoint. We fine-tuned the model using i2b2-2012 NER for ten epochs using the AdamW optimizer with a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a batch size of 4, evaluating every 50 steps and stopping early on the basis of the area under the receiver operating characteristic curve (AUROC) with a patience of 1. This took 20 to 40 min on one node with four NVIDIA 16-GB V100 GPUs. We performed fine-tuning five times with random seeds 0, 13, 24, 36 and 42 and recorded the average and standard deviation of the micro-averaged F1 score (excluding the label for non-entity, ‘O’).
NYUTron + MIMIC-III readmission
We performed the fine-tuning experiments as follows. For both NYUTron and BioClinicalBERT, we initialized a HuggingFace sequence classification model with the LLM as the pretrained checkpoint. We fine-tuned the model using MIMIC-III Readmission for ten epochs using the AdamW optimizer with a learning rate of 2 × 10⁻⁵, a weight decay of 0.01 and a batch size of 16, evaluating every half epoch. We performed fine-tuning five times with random seeds 0, 13, 24, 36 and 42.
Deployment
The fine-tuned model was converted to a high-performance format (Onnx or TensorRT) and loaded into our deployment platform, an NVIDIA Triton inference engine that interfaces with the NYU Langone EHR through the HL7 Fast Health Interoperability Resources (FHIR) 37 interface. For our consideration of performance, security, reliability and interpretability, see Supplementary Information section 5.
Our deployment platform consisted of a modified version of NVIDIA’s Triton Inference Server that we named NYUTriton (pronounced ‘nutrition’ because it is good for the health system). NVIDIA Triton supports GPU-, x86- and ARM CPU-based inferencing and several key features, including dynamic batching, concurrent execution, a highly flexible model specification interface, and the ability to support a wide range of deep learning frameworks and accelerated model formats for maximum throughput. We modified NVIDIA Triton to interface seamlessly with HuggingFace-formatted language models so as to provide a uniform and highly flexible crossover point between our development and production pipelines. Trained models were saved in a standard HuggingFace-style format and converted into Onnx and then TensorRT to obtain sub-millisecond-scale inference results. NYUTriton is hosted on a dedicated inference server that consists of an AMD Threadripper 3960X (24 cores, 3.8 GHz), two RTX 3090 GPUs and 128 GB of DDR5 system memory purchased from Lambda Labs.
Following the signing of discharge summaries in Epic, the HL7 FHIR interface connects with NYUTriton and sends a JavaScript Object Notation (JSON) payload consisting of the discharge summary and metadata specifying the underlying readmission model and sender. NYUTriton preprocesses the text, runs an inference job with the accelerated NYUTron readmission model and returns the model’s inference result to a secondary orchestration server, which writes the result to a database and generates an email to the signing physician.
Structured baselines
The structured baselines were (1) SAPS2/APACHE2 features + XGBoost for in-hospital mortality prediction, (2) LACE features + XGBoost for readmission prediction, (3) Lisbon Portugal features + XGBoost for binned LOS prediction and (4) claim form features + XGBoost for insurance denial prediction.
For all structured baselines, we used the xgboost library to train an extreme gradient-boosted tree classifier with a binary logistic loss (multiclass softmax loss for more than two classes). We used scikit-learn’s randomized search to search hyperparameters among minimum_child_weight from {1, 5, 10}, gamma from {0.5, 1, 1.5, 2, 5}, subsample from {0.6, 0.8, 1}, col_sample_bytree from {0.6, 0.8, 1.0}, max_depth from {3, 4, 5}, learning_rates from {0.001, 0.01, 0.1, 0.5} and n_estimators from {10, 100, 1000} for 100 iterations based on the AUROC score (OVR AUROC score for multiple classes) from threefold cross-validation 38. We ran each experiment five times with distinct random seeds (0, 13, 24, 36, 42). For mortality, binned comorbidity, binned LOS and insurance denial, we ran the experiment with the full dataset. For readmission, we trained the model using subsamples (N ∈ {10², 10³, 10⁴, 10⁵, 3.92336 × 10⁵}) of the NYU Readmission–LACE dataset.
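A sketch of this baseline with scikit-learn's RandomizedSearchCV and the xgboost API is shown below, with toy features standing in for the structured datasets. The text's minimum_child_weight and col_sample_bytree correspond to XGBoost's min_child_weight and colsample_bytree parameters.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

# Toy data standing in for the structured LACE / SAPS2+APACHE2 / claim-form features.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

param_distributions = {
    "min_child_weight": [1, 5, 10],
    "gamma": [0.5, 1, 1.5, 2, 5],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "max_depth": [3, 4, 5],
    "learning_rate": [0.001, 0.01, 0.1, 0.5],
    "n_estimators": [10, 100, 1000],
}

search = RandomizedSearchCV(
    XGBClassifier(objective="binary:logistic", eval_metric="logloss"),
    param_distributions=param_distributions,
    n_iter=100, scoring="roc_auc", cv=3, random_state=0, n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```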
Evaluation metrics
We evaluated the five tasks (in-hospital mortality prediction, binned comorbidity index prediction, 30-day all-cause readmission prediction, binned LOS prediction and insurance denial prediction) with AUC for binary classification and OVR AUROC for multiclass classification. AUROC is the area under the receiver operating characteristic curve, a two-dimensional curve consisting of (FPR, TPR) points obtained at different decision thresholds.
We additionally evaluated readmission prediction with the following metrics: TPR, FPR, precision, recall and F1 score, all of which have a range of [0, 1]. We evaluated NER using a micro-averaged NER F1 score. The NER F1 score is similar to the standard F1 score except that the non-entity label ‘O’ is excluded from the calculation.
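For reference, the seqeval library (whose F1 score is referenced in Extended Data Fig. 2) computes this kind of entity-level micro-averaged F1, ignoring ‘O’ tokens by construction. A toy example with hypothetical BIO tags:

```python
from seqeval.metrics import f1_score

# Hypothetical BIO-tagged sentences; 'O' (non-entity) tokens do not count toward the score.
y_true = [["B-TREATMENT", "I-TREATMENT", "O", "B-PROBLEM"],
          ["O", "B-OCCURRENCE", "O"]]
y_pred = [["B-TREATMENT", "I-TREATMENT", "O", "O"],
          ["O", "B-OCCURRENCE", "O"]]

print(f1_score(y_true, y_pred, average="micro"))
```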
Baseline algorithms for retrospective study
We compared NYUTron against physicians. We worked with six physicians with different levels of seniority: three attending physicians and three residents. The physicians were asked to review discharge summaries and predict whether the described patient would come back to the hospital within 30 days.
We compared NYUTron against four other LLMs and two machine learning models. ‘random-init’ is a BERT-base uncased model with randomly initialized parameters. ‘web-wiki’ is a BERT-base uncased model that is pretrained using web text (from the BookCorpus dataset 39 ) and Wikipedia articles (from the English Wikipedia dataset 40 ). ‘web-wiki+bio’ is a BERT model pretrained using web text, Wikipedia articles, PubMed abstracts 41 and PubMed Central (PMC) full articles 42 . ‘web-wiki+bio+clinical’, or gatortron-og 43 , is a Megatron-BERT 44 model pretrained using web text, Wikipedia articles, PubMed abstracts, PMC full articles, MIMIC-III notes and de-identified clinical notes from University of Florida Health. ‘lace+xgb’ reads structured LACE features (from a traditional clinical prediction rule) with an extreme gradient-boosted tree model 14 . ‘tf-idf+xgb’ reads corpus-level bag-of-words features with an extreme gradient-boosted tree model. For detailed statistics and examples of the pretraining corpora, see Extended Data Table 2 and Extended Data Fig. 3 .
Comparison with physicians
We randomly sampled 20 discharge notes from the random test set and asked six doctors with different seniority to predict whether the patient would come back within 30 days. The six physicians included three attending neurosurgeons, two neurosurgery residents and one ICU resident.
We used REDCap to perform the survey and gave physicians unlimited time. The survey was structured as follows: for each case, we asked “Will this person be admitted within 30 days?”, followed by the discharge summary. The physician could choose to answer “yes” or “no”. If the patient came back within 30 days, we had three follow-up questions to assess the characteristics of the subsequent readmission. First, we asked “Is this readmission related to the prior discharge?”, followed by the H&P note of the subsequent readmission. The physician could answer “yes”, “no”, “partial” or “does not meet Medicare criteria for 30-day readmission”. The second follow-up question was “Is this readmission preventable?”, to which the physician could answer “yes”, “no” or “partial”. The third follow-up question, “Any comments?”, had a free-text response where the physician could explain why the readmission was partially related to the prior discharge or why the readmission was partially preventable.
To collect NYUTron’s predictions, we used the text classification pipeline from HuggingFace to perform inference on the 20 discharge notes. For each discharge note, the pipeline output a predicted probability for readmission. We converted this predicted probability to a binary label with a threshold of 0.07 (a predicted probability of at least 0.07 was converted to a positive label). We chose 0.07 as the decision boundary because it was the minimum threshold that gave us above 80% validation recall among the thresholds {0.01 × n : n ∈ {1, ..., 90}} (the 80% criterion was chosen on the basis of clinical applicability). See Extended Data Fig. 8c for NYUTron’s calibration curve.
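The threshold selection described here can be sketched as a simple scan over the candidate grid; the probabilities and labels below are synthetic stand-ins for the validation-set predictions.

```python
import numpy as np
from sklearn.metrics import recall_score

def pick_threshold(val_probs, val_labels, target_recall=0.80):
    """Smallest threshold in {0.01, 0.02, ..., 0.90} whose validation recall is >= target."""
    for threshold in np.round(np.arange(0.01, 0.91, 0.01), 2):
        preds = (np.asarray(val_probs) >= threshold).astype(int)
        if recall_score(val_labels, preds) >= target_recall:
            return float(threshold)
    return None

# Synthetic validation predictions standing in for NYUTron's predicted probabilities.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
probs = np.clip(labels * 0.4 + rng.normal(0.2, 0.15, size=200), 0, 1)
print(pick_threshold(probs, labels))
```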
Comparison with other language models
Discharge notes + other LLMs for readmission prediction
The dataset, hyperparameters, evaluation and software libraries for fine-tuning the other LLMs were the same as when fine-tuning NYUTron. The pretrained LLMs were constructed as follows: random-init is a BERT-base uncased model with reset parameters; web-wiki is a BERT-base uncased model; web-wiki+bio is the dmis-lab/biobert-base-cased-v1.2 model; and web-wiki+bio+clinical is gatortron-og downloaded from NVIDIA NGC and converted to a HuggingFace checkpoint using the convert_megatron_bert_checkpoint.py script.
Clinical notes + other LLMs for NER
The dataset, hyperparameters, evaluation and software libraries for fine-tuning the other LLMs were the same as for fine-tuning NYUTron. The pretrained LLMs were the same as the baseline LLMs for predicting readmission from discharge notes.
Comparison with machine learning models
LACE features + XGBoost for readmission prediction
Using the NYU Readmission–LACE dataset, we used the xgboost library to train an extreme gradient-boosted tree classifier with a binary logistic loss and hyperparameter search. We used scikit-learn’s randomized search to search among minimum_child_weight from {1, 5, 10}, gamma from {0.5, 1, 1.5, 2, 5}, subsample from {0.6, 0.8, 1}, col_sample_bytree from {0.6, 0.8, 1.0}, max_depth from {3, 4, 5}, learning_rates from {0.001, 0.01, 0.1, 0.5} and n_estimators from {10, 100, 1000} for 100 iterations on the basis of the AUROC score on the validation set 38. We trained the model using subsamples (N ∈ {10², 10³, 10⁴, 10⁵, 3.92336 × 10⁵}) of the NYU Readmission–LACE dataset and evaluated their AUROC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For comparison, we looked at the median AUROC and the standard deviation of the five experiments.
XGBoost + TF-IDF for readmission prediction
We transformed the text from the NYU Readmission dataset into tf-idf (term frequency–inverse document frequency) embeddings and used an xgboost classifier with a binary logistic loss to predict readmission. We used Ray Tune 45 to search hyperparameters, including the maximum number of tf-idf features from {512, 5000}, max_depth from a quantized random integer distribution from 3 to 16 with an interval of 4, learning_rate from a log-uniform distribution from 10⁻² to 10⁻¹, gamma from a quantized uniform distribution from 0 to 12 with an interval of 4, minimum_child_weight from a quantized uniform distribution from 0 to 8 with an interval of 4, reg_lambda from a quantized uniform distribution from 0 to 10 with an interval of 2, colsample_bytree from a uniform distribution from 0.7 to 1, scale_pos_weight from a quantized uniform distribution from 0 to 50 with an interval of 10 and n_estimators from a quantized integer distribution from 50 to 300 with an interval of 50. We trained the model using subsamples (N ∈ {10², 10³, 10⁴, 10⁵, 3.92336 × 10⁵}) of the NYU Readmission dataset and evaluated their AUROC on the temporal test set. For each subsample size, we ran five experiments with distinct random seeds (0, 13, 24, 36, 42). For comparison, we looked at the median AUROC and the standard deviation of the five experiments.
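A stripped-down sketch of the tf-idf+xgb baseline, with toy notes and labels, fixed hyperparameters and without the Ray Tune search, might look like this:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from xgboost import XGBClassifier

# Toy notes and labels standing in for the NYU Readmission text and labels.
notes = ["patient discharged home in stable condition",
         "patient with recurrent chest pain and heart failure",
         "uncomplicated postoperative course",
         "multiple comorbidities, discharged against medical advice"]
labels = [0, 1, 0, 1]

# Corpus-level bag-of-words features (tf-idf), capped at a maximum vocabulary size.
vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(notes)

clf = XGBClassifier(objective="binary:logistic", eval_metric="logloss")
clf.fit(X, labels)
print(clf.predict_proba(vectorizer.transform(["patient with chest pain"]))[:, 1])
```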
Comparison of multi-site pretraining and fine-tuning
We compared NYUTron with its four variants (pretrained and fine-tuned using data from different sites): (1) NYU Notes–Manhattan + NYU Readmission–Manhattan, (2) NYU Notes–Manhattan + NYU Readmission–Brooklyn, (3) NYU Notes–Brooklyn + NYU Readmission–Brooklyn and (4) NYU Notes–Brooklyn + NYU Readmission–Manhattan. The hyperparameters and evaluation and software libraries for fine-tuning NYUTron variants were the same as for fine-tuning NYUTron.
Analysis of prospective performance
On the basis of the temporal test performance in the retrospective study, we selected a fine-tuned model with a decision threshold of 0.07 for use in the prospective trial.
Comparison of mortality rate and LOS
To assess the condition of the readmitted patients who were correctly predicted (n = 3,298), we compared their in-hospital mortality rate and length of hospitalization with those of patients who were admitted in the same period. We collected data on patients who were admitted from February to May 2022 (n = 30,548) and compared their in-hospital mortality rate and LOS with those of the readmitted patients caught by NYUTron from January to April 2022. We used two-sided Welch’s t-tests (with the null hypothesis that the two groups had the same mean) to assess the statistical significance of our comparison 46.
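The statistical test is a standard Welch's t-test; scipy's ttest_ind with equal_var=False implements the unequal-variance, two-sided version. A sketch with synthetic LOS samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic LOS samples (days) standing in for the two patient groups compared in the text.
los_flagged = rng.gamma(shape=2.0, scale=3.5, size=3298)     # readmissions caught by NYUTron
los_general = rng.gamma(shape=2.0, scale=2.5, size=30548)    # general admissions in the same period

# Welch's t-test: two-sided, not assuming equal variances.
t_stat, p_value = stats.ttest_ind(los_flagged, los_general, equal_var=False)
print(t_stat, p_value)
```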
Assessing NYUTron’s clinical impacts with physician review
We performed a post hoc analysis of readmitted patients in the prospective cohort to better understand model performance in a real-world environment and in anticipation of creating targeted interventions based on model outputs. One hundred readmitted patients were sampled from the five largest departments at NYU Langone by patient volume: internal medicine, pediatrics, general surgery, obstetrics and gynaecology, and haematology and oncology. Each department contributed 20 cases, with 10 cases having the highest predicted probabilities in that department and 10 cases having the lowest predicted probabilities. All cases had their encounter IDs logged for their index discharge and readmission on a secure online platform. A standardized questionnaire was constructed for manual review asking whether the readmission was planned, whether the readmission met CMS criteria for a penalized 30-day readmission, whether the readmission was preventable, whether an adverse event occurred on readmission, whether any adverse events were preventable and whether the reviewing physicians had any comments on the case. A team of ten physicians from internal medicine and neurosurgery were randomly assigned cases to be reviewed in pairs, with any disagreement between the reviewers adjudicated by a third physician reviewer. To determine whether a readmission was preventable, the reviewer looked at the discharge note of the inference encounter and the H&P note of the readmission encounter.
Ethical approval
Our research was approved by the NYU Langone institutional review board as ‘s21-01189 NYUtron’, and the methods were carried out in accordance with the institutional review board’s relevant guidelines and regulations.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The clinical data used for the pretraining, fine-tuning, validation and test sets were collected from the NYU Langone Health System EHR maintained by the NYULH Datacore team. Text data were stripped of rich-text features and directly included in the dataset ‘as is’ and were augmented with structured features where noted. These data consist of the production medical records of NYU Langone and cannot be made publicly available. Researchers may obtain a limited de-identified dataset (or a test subset) from NYU Langone Health System by reasonable request and subject to local and national ethical approvals. We also used publicly available i2b2-2012 ( https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/ ) and MIMIC-III ( https://physionet.org/content/mimiciii/1.4/ ) datasets.
Code availability
We used SQL and Python 3.8.13 to collect data from the NYU Langone EHR. We used REDCap 12.4.31 to collect physician responses. This work used several open-source libraries, including HuggingFace Transformers 4.19.2, Datasets 2.2.2, Evaluate 0.1.1, wandb 0.12.17, matplotlib 3.5.2, seaborn 0.12.2, pandas 1.4.2, ray 2.0.0, sklearn 1.1.1, deepspeed 0.8.0+384f17b, NVIDIA Apex, XGBoost 1.6.1 and nltk 3.6.3. Our experimental framework involved the use of these libraries and, in some cases, modification of them. We will release code to replicate the pretraining, fine-tuning and testing of the models described in this paper at the time of publication (code for experiments available at https://github.com/nyuolab/NYUTron, preprocessing code for i2b2-2012 available at https://github.com/nyuolab/i2b2_2012_preprocessing). We include detailed methods and implementation steps in the Methods and Supplementary Information to allow for independent replication.
Change history
04 July 2023
In the version of this article initially published, the graphs in Fig. 2b and Fig. 2c were interchanged, while there were additional patient icons shown in Fig. 4b; the correct figure versions are now shown in the HTML and PDF versions of the article.
Roberts, M. et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat. Mach. Intel. 3 , 199–217 (2021).
Kelly, C. J., Karthikesalingam, A., Suleyman, M., Corrado, G. & King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17 , 195 (2019).
Gaube, S. et al. Do as AI say: susceptibility in deployment of clinical decision-aids. NPJ Digit. Med. 4 , 31 (2021).
Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. in Proc. 2019 NAACL: Human Language Technologies (eds Burstein, J., Doran, C. & Solorio, T.) 4171–4186 (Association for Computational Linguistics, 2019).
Brown, T. B. et al. Language models are few-shot learners. in Proc. NeurIPS (eds Wallach, H. et al.) 1877–1901 (Neural Information Processing Systems, 2020).
Gage, B. F. et al. Selecting patients with atrial fibrillation for anticoagulation: stroke risk stratification in patients taking aspirin. Circulation 110 , 2287–2292 (2004).
Child, C. G. & Turcotte, J. G. Surgery and portal hypertension. Major Prob. Clin. Surg. 1 , 1–85 (1964).
Pugh, R. N. H., Murray-Lyon, I. M., Dawson, J. L., Pietroni, M. C. & Williams, R. Transection of the oesophagus for bleeding oesophageal varices. Br. J. Surg. 60 , 646–649 (2005).
Wells, P. et al. Accuracy of clinical assessment of deep-vein thrombosis. Lancet 345 , 1326–1330 (1995).
Tomašev, N. et al. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 572 , 116–119 (2019).
Wu, N. et al. Deep neural networks improve radiologists’ performance in breast cancer screening. IEEE TMI 39 , 1184–1194 (2020).
Liang, H. et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat. Med. 25 , 433–438 (2019).
Kaplan, J. et al. Scaling laws for neural language models. Preprint at https://doi.org/10.48550/arXiv.2001.08361 (2020).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. in Proc. 2016 SIGKDD 785–794 (Association for Computing Machinery, 2016).
Le Gall, J.-R. A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study. J. Am. Med. Assoc. 270, 2957–2963 (1993).
Knaus, W. A., Draper, E. A., Wagner, D. P. & Zimmerman, J. E. APACHE II: a severity of disease classification system. Crit. Care Med. 13 , 818–829 (1985).
Charlson, M. E., Pompei, P., Ales, K. L. & MacKenzie, C. R. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J. Chron. Dis. 40 , 373–383 (1987).
Caetano, N., Laureano, R. M. S. & Cortez, P. A data-driven approach to predict hospital length of stay—a Portuguese case study. in Proc. 2014 ICEIS (eds Hammoudi, S., Maciaszek, L. & Cordeiro, J.) 407–414 (SCITEPRESS Digital Library, 2014).
Johnson, M., Albizri, A. & Harfouche, A. Responsible artificial intelligence in healthcare: predicting and preventing insurance claim denials for economic and social wellbeing. Inf. Syst. Front. https://doi.org/10.1007/s10796-021-10137-5 (2021).
van Walraven, C., Wong, J. & Forster, A. J. LACE+ index: extension of a validated index to predict early death or urgent readmission after hospital discharge using administrative data. Open Med. 6 , 80–90 (2012).
Centers for Disease Control and Prevention. What is C. diff? https://www.cdc.gov/cdiff/what-is.html (2022).
Yang, G. et al. Language model classifier aligns better with physician word sensitivity than XGBoost on readmission prediction. Preprint at https://doi.org/10.48550/arXiv.2211.07047 (2022).
Perez, E., Kiela, D. & Cho, K. True few-shot learning with language models. in Proc. NeurIPS (eds Ranzato, M. et al.) 11054–11070 (Neural Information Processing Systems, 2021).
Singhal, K. et al. Large language models encode clinical knowledge. Preprint at https://doi.org/10.48550/arXiv.2212.13138 (2022).
Bolton, E. et al. PubMedGPT 2.7B. Technical report. Stanford University Center for Research on Foundation Models https://crfm.stanford.edu/2022/12/15/pubmedgpt.html (2022).
Hoffmann, J. et al. An empirical analysis of compute-optimal large language model training. in Proc. NeurIPS (eds Koyejo, S. et al.) 30016–30030 (Neural Information Processing Systems, 2022).
Charlson, M. Charlson comorbidity index (CCI). MD+CALC https://www.mdcalc.com/calc/3917/charlson-comorbidity-index-cci (2022).
Sun, W., Rumshisky, A., & Uzuner, O. Annotating temporal information in clinical narratives. J. Biomed. Inform. 46 , 5–12 (2013).
Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3 , 160035 (2016).
van Walraven, C. et al. Derivation and validation of an index to predict early death or unplanned readmission after discharge from hospital to the community. Can. Med. Assoc. J. 182 , 551–557 (2010).
Sundararajan, V. et al. New ICD-10 version of the Charlson comorbidity index predicted in-hospital mortality. J. Clin. Epidemiol. 57 , 1288–1294 (2004).
Bird, S. & Loper, E. NLTK: The Natural Language Toolkit. in Proc. 2004 ACL Interactive Poster and Demonstration Sessions 214–217 (Association for Computational Linguistics, 2004).
Wolf, T. et al. Transformers: state-of-the-art natural language processing. in Proc. 2020 EMNLP (eds Webber, B., Cohn, T., He, Y. & Liu, Y.) 38–45 (Association for Computational Linguistics, 2020).
Rajbhandari, S., Rasley, J., Ruwase, O. & He, Y. ZeRO: memory optimizations toward training trillion parameter models. in Proc. Int. Conf. High Performance Computing, Networking, Storage and Analysis 1–16 (IEEE Press, 2020).
Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. ICLR https://openreview.net/forum?id=Bkg6RiCqY7 (2019).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. ICLR https://arxiv.org/abs/1412.6980 (2017).
Ayaz, M., Pasha, M. F., Alzahrani, M. Y., Budiarto, R. & Stiawan, D. The Fast Health Interoperability Resources (FHIR) standard: systematic literature review of implementations, applications, challenges and opportunities. JMIR Med. Inform. 9 , 21929 (2021).
Pedregosa, F. et al. Scikit-Learn: machine learning in Python. J. Mach. Learn. Res. 12 , 2825–2830 (2011).
Zhu, Y. et al. Aligning books and movies: towards story-like visual explanations by watching movies and reading books. in Proc. 2015 ICCV (ed. O’Conner, L.) 19–27 (IEEE, 2015).
Wikimedia Foundation. Wikimedia downloads. https://dumps.wikimedia.org/ (2021).
NCBI Literature Resources. Download PubMed data. https://pubmed.ncbi.nlm.nih.gov/download/ (2022).
National Library of Medicine. PubMed Central: PMC article datasets. https://www.ncbi.nlm.nih.gov/pmc/tools/textmining/ (2022).
Yang, X. et al. A large language model for electronic health records. NPJ Digit. Med. 5 , 194 (2022).
Shoeybi, M. et al. Megatron-LM: training multi-billion parameter language models using model parallelism. Preprint at https://doi.org/10.48550/arXiv.1909.08053 (2020).
Liaw, R. et al. Tune: a research platform for distributed model selection and training. Preprint at https://doi.org/10.48550/arXiv.1807.05118 (2018).
Welch, B. L. The generalization of Student’s problem when several different population variances are involved. Biometrika 34 , 28–35 (1947).
Acknowledgements
E.K.O. is supported by the National Cancer Institute’s Early Surgeon Scientist Program (3P30CA016087-41S1) and the W.M. Keck Foundation. We would like to acknowledge J. Golfinos, whose vision and support made this project possible. We also would like to acknowledge our collaborators M. Costantino and K. Yie from the NYU Langone High-Performance Computing (HPC) team; without their tireless assistance in building and maintaining our GPU cluster, none of this research would have been possible. We would also like to thank D. Bar-Sagi and N. Mherabi, whose support for this research has made everything possible. We would like to thank B. Guzman from the NYU Langone Predictive Analytics Unit and V.J. Major from the NYU Grossman School of Medicine for their help with learning the SQL data structures used as part of this work. We would like to thank Y.(R.) Pang for reviewing and editing the initial manuscript. We would like to thank X. Yang from University of Florida for helping us with preprocessing and evaluating the i2b2 dataset. We thank S. Ciprut for helping with the REDCap survey and research administration for our team. We thank C. Fernandez-Granda, J. Kempe, V. Dhar, N. Wu, M. Barot, A. Chen, K. Link and F. Kwon for their valuable discussions.
Author information
Authors and Affiliations
Department of Neurosurgery, NYU Langone Health, New York, NY, USA
Lavender Yao Jiang, Xujin Chris Liu, Mustafa Nasir-Moin, Howard Antony Riina, Ilya Laufer, Nora C. Kim, Cordelia Orillac, Zane Schnurman, Christopher Livia, Hannah Weiss, David Kurland, Sean Neifert, Yosef Dastagirzada, Douglas Kondziolka, Alexander T. M. Cheung, Grace Yang, Ming Cao & Eric Karl Oermann
Center for Data Science, New York University, New York, NY, USA
Lavender Yao Jiang, Grace Yang, Ming Cao, Kyunghyun Cho & Eric Karl Oermann
Electrical and Computer Engineering, Tandon School of Engineering, New York, NY, USA
Xujin Chris Liu
NVIDIA, Santa Clara, CA, USA
Nima Pour Nejatian, Anas Abidin, Mona Flores & Anthony B. Costa
Predictive Analytics Unit, NYU Langone Health, New York, NY, USA
Duo Wang & Yindalon Aphinyanaphongs
Department of Internal Medicine, NYU Langone Health, New York, NY, USA
Kevin Eaton, Paawan Punjabi & Madeline Miceli
Department of Population Health, NYU Langone Health, New York, NY, USA
Yindalon Aphinyanaphongs
Prescient Design, Genentech, New York, NY, USA
Kyunghyun Cho
Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
Canadian Institute for Advanced Research, Toronto, Ontario, Canada
Department of Radiology, NYU Langone Health, New York, NY, USA
Eric Karl Oermann
Contributions
E.K.O. conceptualized and supervised the project. L.Y.J. collected data (except the NYU Insurance Denial and MIMIC-III Readmission datasets) and performed experiments. L.Y.J. and X.C.L. prepared the figures. X.C.L., N.P.N., M.N.-M. and K.C. debugged and tested the model and the pretraining and fine-tuning software. E.K.O. designed the NYUTriton deployment platform, and E.K.O., A.A. and D.W. built the system and integrated it with the EHR. K.E., E.K.O., D.W. and Y.A. collected and processed the NYU Insurance Denial dataset. H.A.R., I.L., P.P., K.E., M.M., N.C.K., C.O., Z.S., C.L., H.W., D.K., S.N., Y.D., D.K. and A.T.M.C. participated in the human experiments, review of cases, and providing user feedback and testing. G.Y. and M.C. provided the scripts for tf-idf+xgb and built the MIMIC-III Readmission dataset. M.F., A.B.C., Y.A. and K.C. provided guidance and feedback throughout the project. L.Y.J., K.C. and E.K.O. wrote the initial draft. L.Y.J., E.K.O., K.C., M.N.-M., G.Y. and M.C. formatted the final submission. All authors edited and revised the manuscript.
Corresponding author
Correspondence to Eric Karl Oermann .
Ethics declarations
Competing interests.
E.K.O. reports consulting with Sofinnova and Google, income from Merck & Co. and Mirati Therapeutics, and equity in Artisight. N.P.N., M.F. and A.B.C. are employed by NVIDIA. D.K. reports consulting with Elekta. K.C. is employed by Prescient Design, a Genentech accelerator, a subsidiary of Roche. There are no other potential conflicts of interest. The work presented herein was performed exclusively within the NYU Langone Health System.
Peer review
Peer review information.
Nature thanks Ziad Obermeyer and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Difference between random test and temporal test.
a, The AUC curve for the random test shows better performance than for the temporal test. The random-test AUC is 84.13%, compared with a temporal-test AUC of 80.2%. The difference highlights the importance of creating a test set that reflects the problem setup. In the case of readmission prediction, the deployment set always comes from the future of the training set, so we use the temporal-test AUC for model selection. b, Comparison of random-test AUC and temporal-test AUC as the number of training examples increases shows that temporal testing is important for estimating deployment performance. A temporally split held-out dataset appears “harder” than a randomly sampled test set: all tested LLMs and lace+xgb perform worse on the temporal test (notes from the future) than on the random test (notes from the same period as the training data), and the colored lines on the left (random-test AUCs) are generally higher than the colored lines on the right (temporal-test AUCs). We conclude that temporally sampled held-out test sets give a more realistic estimate of model performance. Interestingly, the language models seem to be more sensitive to this phenomenon than the lace+xgb model.
Extended Data Fig. 2 Benchmarking NYUTron against a traditional NLP model and other language models on a different clinical prediction task (clinical concept extraction).
We observe a similar trend as for readmission prediction: (a) shows that NYUTron has better performance than tf-idf under different data availability settings and (b) shows that clinically pretrained language models have better performance than non-clinically pretrained language models. This corroborates our findings that health-system-scale language models are general-purpose clinical prediction engines and that a domain match between the pretraining and fine-tuning corpora contributes to task performance. a, Comparison of temporal test AUCs between NYUTron and a traditional NLP model (tf-idf+xgb). NYUTron has a higher median AUC than tf-idf+xgb for all tested numbers of fine-tuning examples. The black vertical line indicates the standard deviation over 5 trials with different random seeds (0, 13, 24, 36, 42). b, Comparison of LLMs’ fine-tuning performance on the NER task. On the i2b2-2012 clinical concept extraction task, the LLMs that are pretrained with clinical corpora (NYUTron, web-wiki+bio+clinical) have a higher average F1 score than LLMs that are not pretrained with clinical corpora (web-wiki+bio, web-wiki, random-init). Specifically, NYUTron and web-wiki+bio+clinical perform better than the randomly initialized model (36.64% higher median seqeval F1 score) and the non-clinically pretrained models (2.01%–3.48% higher median seqeval F1 score). Note that the height of each bar is the average F1 score and half the length of each black vertical line indicates the standard deviation over 5 trials with different random seeds (0, 13, 24, 36, 42).
Extended Data Fig. 3 Examples of pretraining corpora.
We include here some examples from the pretraining corpora used, to help contextualize our work. Examples are shown from three types of pretraining corpus: (1) web-wiki (online books from BookCorpus and encyclopedia articles from English Wikipedia), (2) bio (abstracts of academic papers from PubMed Abstracts and full articles from PubMed Central) and (3) clinical (NYU Notes and NYU Readmission from the NYU Langone EHR, and clinical notes from University of Florida Health).
Extended Data Fig. 4 Comparison of NYUTron’s and BioClinicalBERT’s performance on MIMIC-III Readmission.
To test how much fine-tuning NYUTron needs to generalize to another health system, we fine-tune NYUTron and BioClinicalBERT (which has the same number of parameters and architecture as NYUTron but was pretrained on MIMIC notes, BookCorpus, PubMed and Wikipedia articles) using different subsamples of the MIMIC-III Readmission dataset. The dataset contains 52,726 de-identified ICU discharge notes from Boston Beth Israel Hospital with an 8:1:1 train–validation–test split. At 100 samples, the AUC is similar. At 1,000 samples, NYUTron has a 3.58% higher median AUC than BioClinicalBERT (57.22% vs 53.64%). At 10,000 samples, NYUTron has a 6.42% higher median AUC than BioClinicalBERT (65.56% vs 59.14%). Using the full dataset (42,180 samples), NYUTron has a 3.8% higher median AUC than BioClinicalBERT (67.04% vs 63.24%). Given that NYUTron was pretrained on identified all-department notes from NYU Langone and fine-tuned on de-identified ICU-specific notes from Beth Israel, this result shows that NYUTron is able to generalize to a very different health environment through local fine-tuning. The height of each bar indicates the median performance of 5 experiments using distinct random seeds (0, 13, 24, 36, 42) and the error bars indicate the min–max range.
Extended Data Fig. 5 Bias analysis stratifying NYUTron’s performance by clinical departments and months.
a , A stratified analysis of NYUTron’s temporal test performance by clinical department and oncological subspecialty. NYUTron performs best in the Neurology Department (AUC 90.12%), and performs worst in the Internal Medicine Department (AUC 67.95% for non-oncology specialty and AUC 63.77% for oncology specialty), with a difference of about 20% AUC. This significant variance across clinical departments suggests that a more fine-grained analysis may lead to performance benefits. We annotate the number of examples (N) and the readmission rate (p) for each department. b , NYUTron’s performance displays minor fluctuations over months. We plot the average monthly test AUC of NYUTron from January 2013 to December 2021 to look for underlying monthly trends or cycles and to test the hypothesis that performance would be worst in July when new physicians start their training with a different writing style than physicians already in practice (dashed red line indicating the monthly AUC of July). The height of the bar indicates average monthly performance across the 9 years and the vertical bar indicates the standard deviation. We annotate the number of examples (N) and the readmission rate (p) for each month. We note that July has the second lowest monthly AUC and the highest variance. We speculate (and need more years of data to verify) that clinical notes written by new physicians are associated with the temporal shift across the months and the drop in performance in July. Average AUCs from the quarters January to March, April to June, and July to September are increasing, which may coincide with residents’ rotation schedule across different clinical departments. We leave further investigation of this cyclical performance to future work.
Extended Data Fig. 6 Bias analysis stratifying NYUTron’s performance by age groups and major racial groups.
As part of an analysis of model performance by two possible sources of bias, age and race, we perform stratified analyses of NYUTron’s performance. We annotate the number of examples (N) and the readmission rate (p) for each evaluation. a, We stratify the temporal test based on nine age bins (0 to 90 years in 10-year intervals). NYUTron performs best for patients who are 10 to 40 years old and has declining performance by decile over the age of 40 years, with the worst performance in the 80–90 years age group. We observe that this is not an effect of sample size (the single largest sample is the 80–90 years group) but likely reflects complexity and comorbidity burdens that are disproportionately higher with advanced age. b, To test for potential dependencies and bias by race, we first identify the five most frequent races in the dataset (White, Other Race, Black, Chinese, Indian) and then stratify the evaluation results by race. NYUTron performs best on Chinese patients and worst on Black patients, with a mild variation in AUC across groups.
Extended Data Fig. 7 Detailed statistics of the comparison between language models and lace+xgb.
a, A box plot with individual data points. For each model, 5 experiments were run using random seeds 0, 13, 24, 36 and 42. The centerline of the box plot indicates the median. The upper line of the box indicates the third quartile and the lower line indicates the first quartile. The whiskers extend to 1.5 times the interquartile range and the diamonds indicate outliers. b, A bar plot showing the mean and standard deviation. The height of each bar indicates the mean across the 5 experiments and the length of the black vertical line indicates the standard deviation.
Extended Data Fig. 8 Additional information about readmission prediction.
a , Visualization of readmission data split timelines. We visualize the random split, temporal split, and deployment split on a timeline to indicate this decision for model evaluation. The random split starts from January 2013 and ends in May 2021 (inclusive), which is further split into a 80% train set, 10% validation set and a 10% test set. The temporal split (temporal test) starts from June 2021 and ends in December 2021, a time period from which no training samples were sampled from. The deployment data is necessarily sampled from the future as it is accrued prospectively as part of our single arm, non-interventional clinical trial. b , NYUTron’s performance increases with more complete input notes. To attempt to estimate performance as a function of sequence length we sampled a subset of “long notes” from the temporal test set. Each note in this subset has no less than 400 words, or approximately 512 tokens. We truncated these long notes to 100, 200, 300 and 400 words while keeping their readmission labels fixed in order to demonstrate the incremental gain in performance as we capture proportionally more information from each of these “long notes”. The dashed line is the AUC of all notes. This figure shows that processing more words from the possible input leads to a better evaluation performance and confirms that there is a clear potential for improving performance by increasing maximum sequence length. c , d NYUTron’s calibration curve for temporal test (c, number of evaluation examples is N = 53,916) and prospective deployment (d, number of evaluation examples is N = 29,286). As a reference, the orange line is the calibration curve of an ideally calibrated classifier. The blue line is NYUTron’s calibration curve. Currently we do not perform any additional calibration and choose the decision threshold based on the precision and recall on the temporal validation set. The predicted probability is normalized by the largest predicted probability. Overall the model is well calibrated to the 30-day readmission task.
Jiang, L.Y., Liu, X.C., Nejatian, N.P. et al. Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362 (2023). https://doi.org/10.1038/s41586-023-06160-y
From Positive to Negative to Positive Again—The Mystery of Why COVID-19 Rebounds in Some Patients Who Take Paxlovid
David Ho, MD, managed to avoid contracting COVID-19 for more than 2 years.
But SARS-CoV-2 finally got the best of the pioneering HIV researcher on an April trip to Paris for, of all things, a 2-day COVID-19 conference.
The irony is not lost on Ho, director of the Aaron Diamond AIDS Research Center at Columbia University. He figures he most likely became infected at a preconference dinner for a small group of attendees. They dined inside a restaurant, and the waitstaff weren’t wearing masks, Ho explained in an interview.
Shortly after he returned home, Ho started coughing. His throat hurt, his head ached, his nose was runny, and he felt even more fatigued than a healthy person should after a quick trip across the pond and back. He immediately assumed that this was no cold, and a rapid antigen test followed by a polymerase chain reaction (PCR) test confirmed that he indeed had COVID-19.
About 12 hours after his symptoms arose, Ho swallowed his first dose of Pfizer’s antiviral nirmatrelvir/ritonavir, better known as Paxlovid. By day 4, his symptoms had resolved and he tested negative for COVID-19. After testing negative again on day 5, he ended his isolation from his family but continued to test daily.
After 6 consecutive negative rapid antigen tests, plus a negative PCR test, Ho awoke feeling under the weather. “I tested myself immediately, and I was completely surprised that I was positive again,” Ho recalled. “The initial shock was, ‘Wow, this is positive. I’ve never seen this.’”
A PCR test confirmed the positive rapid antigen test, and it was “back to jail” for Ho. “If you’re positive, you have to assume you’re infectious to others,” he explained.
In recent weeks, similar cases have been reported in the medical literature and on social media, prompting the Health Alert Network of the US Centers for Disease Control and Prevention (CDC) to issue a health advisory on May 24. COVID-19 rebound in people who’ve taken nirmatrelvir/ritonavir appears to be mild and short-lived, resolving, on average, in 3 days without additional anti-COVID-19 treatment, according to the advisory.
“I would say the anecdotes are pretty consistent and pretty pronounced,” H. Clifford Lane, MD, deputy director for clinical research and special projects at the National Institute of Allergy and Infectious Diseases, said in a recent interview. “Is there something here? If there is, what is it, and what do we do about it?”
No one is suggesting that people stop using the drug. In boldface type, the CDC’s health advisory says the agency continues to recommend nirmatrelvir/ritonavir for early treatment of mild to moderate COVID-19 among people at high risk of progression to severe disease, the population eligible for the drug under its Emergency Use Authorization (EUA), granted by the US Food and Drug Administration (FDA) in December 2021.
However, the unexpected rebound phenomenon raises questions about how best to use this antiviral. “There are more questions than answers,” Myron Cohen, MD, director of the University of North Carolina Institute for Global Health & Infectious Diseases in Chapel Hill and a leader of the National Institutes of Health’s COVID-19 Prevention Network, noted in an interview.
Under its EUA, nirmatrelvir/ritonavir can be prescribed for mild to moderate COVID-19 in nonhospitalized patients aged 12 years or older who are at high risk of progression to severe disease due to age, obesity, cancer, or chronic diseases such as type 1 or type 2 diabetes. (High-risk patients who have mild to moderate COVID-19 but are hospitalized for other reasons are also eligible.)
A 3-pill dose of Paxlovid consists of 2 nirmatrelvir pills and 1 ritonavir pill; ritonavir has no activity against SARS-CoV-2 on its own. Nirmatrelvir is a protease inhibitor that blocks SARS-CoV-2 from replicating, while ritonavir boosts nirmatrelvir by slowing its metabolism in the liver. Ritonavir, which has been used to boost HIV protease inhibitors, can also slow the metabolism of an assortment of other drugs, raising their blood concentrations too high. In many cases, though, drug-drug interactions can be managed by temporarily withholding, adjusting the dose of, or substituting an alternative for the concomitant medication and by increasing monitoring for potential adverse reactions, according to the National Institutes of Health’s COVID-19 Treatment Guidelines, advice echoed in Infectious Diseases Society of America guidelines published May 6.
While the antiviral remdesivir (Veklury) also has been shown to be highly effective in decreasing the risk of hospitalization of people with mild to moderate COVID-19, patients must go to an infusion center on 3 consecutive days for treatment. Nirmatrelvir/ritonavir pills, on the other hand, can be picked up at the drugstore and taken at home.
Similarly, the antiviral molnupiravir (Lagevrio), which received an EUA in December 2021 for treating mild to moderate COVID-19 in high-risk adults aged 18 years or older, is taken as a pill for 5 days, starting within 5 days of symptom onset.
However, in the placebo-controlled clinical trials supporting their EUAs, molnupiravir, a collaboration between Merck and Ridgeback Biotherapeutics, was not as effective as nirmatrelvir/ritonavir at keeping patients out of the hospital. Molnupiravir is authorized only for patients for whom alternative FDA-authorized COVID-19 treatments aren’t accessible or clinically appropriate. And while nirmatrelvir/ritonavir is authorized for children as young as 12 years, molnupiravir isn’t authorized for anyone younger than 18 years because it may affect bone and cartilage growth. Molnupiravir, which stops SARS-CoV-2 from replicating via a different pathway than nirmatrelvir/ritonavir, is not recommended for pregnant individuals because animal studies suggest it could cause fetal harm.
The US government has purchased 3.1 million courses of molnupiravir and 20 million courses of nirmatrelvir/ritonavir, to be delivered this year. The Office of the Assistant Secretary for Preparedness and Response, part of the US Department of Health and Human Services, maintains a web-based site locator for drugstores and other facilities that have received an order of nirmatrelvir/ritonavir or molnupiravir in the previous 2 months or reported their availability within the previous 2 weeks. In addition, a constantly updated website enables people to search specifically for SARS-CoV-2 treatments in their community.
Although nirmatrelvir/ritonavir protects against severe COVID-19 in symptomatic people who’ve tested positive for SARS-CoV-2 infection, it doesn’t prevent individuals from becoming positive and symptomatic, according to an April 29 Pfizer press release about the findings of a randomized, placebo-controlled clinical trial among adults who had been exposed to SARS-CoV-2 through a household contact.
A Different World
Rebound isn’t even mentioned in the article, published in April, that reported results from the EPIC-HR (Evaluation of Protease Inhibition for COVID-19 in High-Risk Patients) trial, which was the basis of Paxlovid’s EUA. Omicron isn’t mentioned either, because the trial participants, none of whom were vaccinated, contracted COVID-19 before Omicron burst onto the scene.
A clinical trial among unvaccinated people that predates Omicron, which now accounts for 100% of circulating SARS-CoV-2 in the US, “holds little clinical relevance” today, Emory University School of Medicine and Grady Health System infectious disease specialist Carlos del Rio, MD, said in an interview. Even unvaccinated people today are different from unvaccinated people pre-Omicron because they’re more likely to have had at least 1 previous COVID-19 infection, Cohen pointed out.
A study posted online May 26 but not peer-reviewed is one of the first to explore real-world effectiveness of nirmatrelvir/ritonavir and molnupiravir in vaccinated as well as unvaccinated patients infected with Omicron, according to its authors.
Conducted in Hong Kong, the retrospective cohort study focused on nearly 1.1 million nonhospitalized patients territory-wide with confirmed SARS-CoV-2 infection during the Omicron BA.2.2 wave between February 26 and May 3, 2022. Among them, 5257 took molnupiravir and 5663 took nirmatrelvir/ritonavir.
Both antivirals were associated with lower all-cause mortality risk—a 39% reduction for molnupiravir, 75% for nirmatrelvir/ritonavir—compared with no antiviral use. Both also were associated with a lower risk of in-hospital disease progression—36% for molnupiravir and 53% for nirmatrelvir/ritonavir—compared with no antiviral use. Nirmatrelvir/ritonavir was associated with a 31% lower risk of hospitalization, while the hospitalization risk in patients who took molnupiravir was comparable with that of patients who didn’t take an antiviral.
Neither drug was associated with as high a level of protection among the Hong Kong patients infected with Omicron as was seen in its own clinical trial among unvaccinated patients infected with the Delta variant. (In those trials, molnupiravir reduced hospitalization risk by 30% compared with placebo, while nirmatrelvir/ritonavir reduced it by 88%.)
In the Hong Kong study, nirmatrelvir/ritonavir use was associated with greater and more consistent protection than molnupiravir use, and the protective effects of nirmatrelvir/ritonavir were similar regardless of vaccination status and age. However, the apparent superiority of nirmatrelvir/ritonavir to molnupiravir in the study could have been due in part to a higher proportion of patients older than 65 years and a lower proportion of fully vaccinated patients among those who received the latter drug, the authors noted.
Pfizer is currently enrolling an estimated 1980 adults in a trial comparing Paxlovid with placebo, similar to EPIC-HR. But in the newer trial, participants have only a standard risk, not a high risk, of progressing from mild or moderate COVID-19 to severe disease. Another major difference between EPIC-HR and EPIC-SR (standard risk) is that all the participants are likely to have been infected with Omicron, not Delta, given the timing of the trial. Anyone who received any COVID-19 vaccine within 12 months of screening is ineligible to enroll in EPIC-SR, which means participants could have received their primary vaccine doses but not boosters.
Is It the Drug, or Is It the Disease?
The CDC’s May 24 health advisory noted that “a brief return of symptoms may be part of the natural history of SARS-CoV-2…infection in some persons, independent of treatment with Paxlovid and regardless of vaccination status.”
The FDA is aware of cases in which patients treated with Paxlovid tested positive at least once after testing negative, John Farley, MD, MPH, director of the Office of Infectious Diseases at the FDA’s Center for Drug Evaluation and Research, noted in an early May update for health care professionals. An additional analysis of the EPIC-HR clinical trial data showed that about 1% to 2% of participants in both the treatment and placebo groups tested positive after testing negative, Farley wrote, “so it is unclear at this point that this is related to drug treatment.”
In an email, Pfizer spokeswoman Jerica Pitts echoed the CDC and FDA. “We believe the return of elevated detected nasal viral RNA [is] uncommon and not uniquely associated with treatment,” she wrote.
Ho, a coauthor of a study posted May 23 describing 10 people who rebounded after taking nirmatrelvir/ritonavir (he is the second case described in the report), disagrees. When asked whether he thought the rebound could be part of the natural course of SARS-CoV-2 infection, he replied “absolutely not.”
As evidence, he and his coauthors pointed to the experience of the National Basketball Association (NBA), which tests personnel daily. From December 14, 2021, to March 1, 2022, a period during which Omicron was dominant, rebounds occurred only on the basketball court—not among the nearly 1000 NBA personnel who were diagnosed with COVID-19 but not treated with nirmatrelvir/ritonavir, according to Ho and his coauthors. Their study had not undergone peer review.
However, Johns Hopkins breast cancer specialist Tatiana Prowell, MD, recently tweeted that she’s heard from physicians who’ve had patients with COVID-19 rebound or test positive on rapid antigen tests for up to 3 weeks, even though they never took nirmatrelvir/ritonavir or any other treatment. Perhaps, she speculated, Omicron and its subvariants take longer to peak or to clear than earlier SARS-CoV-2 variants. (Prowell had recently tweeted about how a household member’s symptoms disappeared and rapid antigen test results turned negative after completion of a course of nirmatrelvir/ritonavir. A week later, though, the symptoms returned, as did positive rapid antigen test results.)
Still, “[y]ou just didn’t hear about many rebounds pre-Paxlovid,” Robert Wachter, MD, who tweeted in May about his wife Katie Hafner’s rebound after taking Paxlovid, said in an interview. “You have to say that there’s something about Paxlovid and Omicron that predisposes you to this phenomenon.”
The fact that few EPIC-HR participants experienced a rebound “gives me less confidence in all of the findings,” said Wachter, chair of the University of California, San Francisco, Department of Medicine. If he tested positive—despite his wife’s COVID-19 bout, he hasn’t—Wachter said he’d be “much more on the borderline” trying to decide whether to take Paxlovid than he would have been before he started hearing about rebounds.
“Even if rebounds turn out to be mild and self-limited, they have consequences for people in terms of their ability to go back to work or to school,” Wachter pointed out.
No one yet knows how common rebound is among people who’ve taken Paxlovid. “You need some very objective evaluation of it,” Lane said.
Ho dismisses Pfizer’s contention that rebound is uncommon. He and his coauthors noted that 5 of the 10 relapses described in their report occurred within 2 families—2 in his family and 3 in another—suggesting it isn’t rare.
That’s concerning, Ho said, because it appears that people who experience a relapse can infect others. Among the 10 cases in the report he coauthored, viral load during the relapse was comparable to levels during the initial infection. During their relapse, 1 symptomatic and 1 presymptomatic patient transmitted SARS-CoV-2 to family members, Ho and his coauthors wrote.
Trying to Figure It Out
Ho likely is one of very few people who’ve relapsed after taking nirmatrelvir/ritonavir and then sequenced their own virus both the first and second time around. (At least 1 other leading virologist, Peter Hotez, MD, PhD, of the Baylor College of Medicine, has revealed that he also experienced a post-Paxlovid relapse.)
Ho’s two virus sequences were identical, ruling out a couple of possible explanations for his relapse, he said. It couldn’t have been a stroke of bad luck, a second SARS-CoV-2 infection just as he was getting over his first one. And it couldn’t have resulted from the virus becoming resistant to nirmatrelvir. In either case, the two sequences would not have been identical.
Scientists have proposed a few other possible explanations for rebounds after nirmatrelvir/ritonavir treatment. “Question number 1 in my mind is the timing. I think maybe we’re giving it too early,” del Rio said. Perhaps, Wachter speculated, “If you get started right away, maybe you suppress the virus [and] the immune system doesn’t rev up in the way it normally would.” He and others have also suggested that 5 days might not be a long enough treatment course. “All these theories are total handwaving,” Wachter acknowledged.
To answer the outstanding questions about relapses, “I’m not sure we need a clinical trial in the classical definition,” del Rio said. “We need post-approval data.” It’s being collected, he said, but the findings probably won’t be available for months. For now, there is no evidence that additional treatment with nirmatrelvir/ritonavir is needed when a rebound is suspected after a 5-day course, according to both the CDC’s advisory and FDA official Farley’s recent update.
Despite all the questions about the rebound phenomenon, “the biggest challenge we’re having with the drug is it’s not being used as frequently as it should,” del Rio said. “Primary care physicians are freaked out about the drug-drug interactions.” The people in whom COVID-19 is most likely to progress to a serious or even deadly infection are also the ones most likely to be taking multiple medications, he noted.
One thing is for sure: testing is more important than ever, given the availability of effective treatments for COVID-19, Cohen said. “I see a sea change in the management of respiratory infections.”
Nobody says, “Oh, I think I have a cold” anymore, he explained. “If we’re going to treat people with COVID, we need to know if they have COVID.”
Published Online: June 8, 2022. doi:10.1001/jama.2022.9925
Conflict of Interest Disclosures: None reported.
Rubin R. From Positive to Negative to Positive Again—The Mystery of Why COVID-19 Rebounds in Some Patients Who Take Paxlovid. JAMA. 2022;327(24):2380–2382. doi:10.1001/jama.2022.9925