Show simple item record

dc.contributor.advisor: Protopapas, Pavlos
dc.contributor.advisor: Wang, Hongming
dc.contributor.author: Marsh, Tanner
dc.date.accessioned: 2024-05-18T12:02:17Z
dash.embargo.terms: 2025-05-17
dc.date.created: 2024
dc.date.issued: 2024-05-17
dc.date.submitted: 2024
dc.identifier.citation: Marsh, Tanner. 2024. Natural Language Search for NASA ADS. Master's thesis, Harvard University Division of Continuing Education.
dc.identifier.other: 31294003
dc.identifier.uri: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37378607
dc.description.abstract: The NASA Astrophysics Data System (ADS) is a critical resource for researchers and students in astronomy, astrophysics, and beyond. ADS indexes a vast collection of papers and other scholarly literature that researchers can search through the ADS website or API. ADS's database is powered by Apache Solr, which lets users formulate highly expressive and precise search queries over more than 50 allowable search fields. However, the sophistication of ADS's search capabilities comes at the cost of usability: users must familiarize themselves with Solr and the ADS documentation to fully exploit its features. This thesis proposes to make ADS more accessible through a chat application in which users request papers in natural language rather than by constructing Solr queries. The application uses state-of-the-art transformer-based large language models (LLMs) to translate natural language requests into Solr queries, simplifying user interaction with the ADS database without compromising the precision of search results. In this work, we use in-context learning (ICL) with retrieval-augmented generation (RAG) to enhance the translation capabilities of the LLM, leading to a significant improvement in translation performance.
dc.format.mimetype: application/pdf
dc.language.iso: en
dash.license: LAA
dc.subject: Astrophysics Data System (ADS)
dc.subject: few-shot learning
dc.subject: in-context learning (ICL)
dc.subject: retrieval-augmented generation (RAG)
dc.subject: Solr
dc.subject: text-to-sql
dc.subject: Computer science
dc.subject: Artificial intelligence
dc.subject: Astronomy
dc.title: Natural Language Search for NASA ADS
dc.type: Thesis or Dissertation
dash.depositing.author: Marsh, Tanner
dash.embargo.until: 2025-05-17
dc.date.available: 2024-05-18T12:02:17Z
thesis.degree.date: 2024
thesis.degree.grantor: Harvard University Division of Continuing Education
thesis.degree.level: Masters
thesis.degree.name: ALM
dc.type.material: text
thesis.degree.department: Extension Studies
dash.author.email: tannerjmarsh@gmail.com
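
The abstract describes translating natural language requests into Solr queries with in-context learning and retrieval-augmented generation. The following is a minimal illustrative sketch of that general idea, not the thesis's implementation: it retrieves the stored examples most similar to a request and assembles them into a few-shot prompt for an LLM. The example request/query pairs, the token-overlap (Jaccard) similarity, the prompt wording, and the names `EXAMPLES`, `retrieve`, and `build_prompt` are all assumptions made for illustration; the record does not say which retriever or prompt format the thesis uses.

```python
import re

# Hypothetical (request, ADS query) pairs; these are illustrative only
# and are not taken from the thesis.
EXAMPLES = [
    ("papers by John Huchra about galaxy redshift surveys",
     'author:"Huchra, John" abs:"redshift survey"'),
    ("refereed papers on dark matter halos published in 2020",
     'abs:"dark matter halo" year:2020 property:refereed'),
    ("papers with exoplanet atmospheres in the title from 2015 to 2020",
     'title:"exoplanet atmospheres" year:2015-2020'),
]


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard overlap between two strings (a stand-in for an embedding model)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(request, k=2):
    """Return the k stored examples whose requests are most similar."""
    ranked = sorted(EXAMPLES,
                    key=lambda ex: similarity(request, ex[0]),
                    reverse=True)
    return ranked[:k]


def build_prompt(request, k=2):
    """Assemble a few-shot prompt: retrieved examples first, then the new request."""
    lines = ["Translate each request into an ADS search query.", ""]
    for req, query in retrieve(request, k):
        lines += [f"Request: {req}", f"Query: {query}", ""]
    lines += [f"Request: {request}", "Query:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("refereed papers about dark matter from 2020"))
```

The resulting prompt would then be sent to an LLM, whose completion is taken as the candidate Solr query. A real system would likely replace the token-overlap retriever with embedding-based search over a much larger example set.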

