
PdfQuery-using-LangChain with AstraDB - Generative AI from No-Code to Low-Code: A Four-Part Series on Development and Applications - Cupoy


!pip install -q cassio datasets langchain openai==0.26.5 tiktoken

Question Answering using LangChain & AstraDB

Overview:
To get this demo running, you need a Serverless Cassandra database with Vector Search enabled, set up on Astra DB. Obtain a DB token with Database Administrator permissions and note your Database ID; you need both to connect.
Link to Astra DB: https://www.datastax.com/products/datastax-astra
An OpenAI API key is also required for the natural language features.

With those prerequisites met, you can import the code dependencies, configure your credentials, initialize the LangChain vector indexer, and start the question-answering loop. The loop retrieves relevant passages from the database using vector search and uses the LLM to generate answers in natural language.

The key steps are establishing the database connection, providing API keys, setting up LangChain, and implementing the QA loop to find pertinent passages and produce responses.
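To make the "vector search" step in the overview concrete, here is a minimal sketch of ranking stored passages by similarity to a query vector. It is only a conceptual illustration: the three-dimensional vectors and the `toy_store` list are made up, `cosine_similarity` and `top_k` are hypothetical helper names, and cosine similarity is assumed as the metric. In the demo itself the embeddings come from OpenAI and the search runs inside Astra DB.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=4):
    """Return the k (score, text) pairs most similar to query_vec.

    store is a list of (text, vector) pairs; this brute-force scan is an
    assumption for illustration, not how Astra DB searches internally.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up passages and embeddings, for illustration only
toy_store = [
    ("air pollution and health", [0.9, 0.1, 0.0]),
    ("household cooking fuels", [0.2, 0.8, 0.1]),
    ("report contributors", [0.0, 0.1, 0.9]),
]

results = top_k([1.0, 0.0, 0.0], toy_store, k=2)
for score, passage in results:
    print("[%0.4f] %s" % (score, passage))
```

The retrieved passages are then handed to the LLM as context, which is why retrieval quality directly limits answer quality.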
# LangChain components to use
from langchain.vectorstores.cassandra import Cassandra
from langchain.indexes.vectorstore import VectorStoreIndexWrapper
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings

# Support for dataset retrieval with Hugging Face
from datasets import load_dataset

# With CassIO, the engine powering the Astra DB integration in LangChain,
# you will also initialize the DB connection:
import cassio

Enter API keys:

ASTRA_DB_APPLICATION_TOKEN = ""  # enter your Astra DB application token
ASTRA_DB_ID = ""  # enter your astradb Database ID
OPENAI_API_KEY = ""  # enter your OpenAI api key

!pip install pdfminer.six

from pdfminer.high_level import extract_text
text = extract_text("llama2.pdf")
len(text)

Connect with AstraDB database:

cassio.init(token=ASTRA_DB_APPLICATION_TOKEN, database_id=ASTRA_DB_ID)

LLM building & using OpenAI embeddings:

llm = OpenAI(openai_api_key=OPENAI_API_KEY)
embedding = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

Creating the vector storage db

astra_vector_store = Cassandra(
    embedding=embedding,
    table_name="kevin_mini_demo",
    session=None,
    keyspace=None,
)

from langchain.text_splitter import CharacterTextSplitter
# Split the text with CharacterTextSplitter so that each chunk stays small enough not to exceed the token limit
text_splitter = CharacterTextSplitter(
    separator = "\n",
    chunk_size = 1000,
    chunk_overlap = 200,
    length_function = len,
)
texts = text_splitter.split_text(text)
len(texts)
texts[:105]

Load the dataset into the vector store

astra_vector_store.add_texts(texts[:108])
print("Inserted %i headlines." % len(texts[:108]))
astra_vector_index = VectorStoreIndexWrapper(vectorstore=astra_vector_store)

Question and Answer:

first_question = True
while True:
    if first_question:
        query_text = input("\nEnter your question (or type 'quit' to exit): ").strip()
    else:
        query_text = input("\nWhat's your next question (or type 'quit' to exit): ").strip()

    if query_text.lower() == "quit":
        break

    if query_text == "":
        continue

    first_question = False

    print("\nQUESTION: \"%s\"" % query_text)
    answer = astra_vector_index.query(query_text, llm=llm).strip()
    # reference link: https://python.langchain.com/docs/integrations/vectorstores/faiss?highlight=FAISS.from_documents#similarity-search-with-score
    print("ANSWER: \"%s\"\n" % answer)

    print("FIRST DOCUMENTS BY RELEVANCE:")
    for doc, score in astra_vector_store.similarity_search_with_score(query_text, k=4):
        print(" [%0.4f] \"%s ...\"" % (score, doc.page_content[:107]))

output contents:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/55.5 kB ? eta -:--:--
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 55.5/55.5 kB 1.3 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Building wheel for openai (pyproject.toml) ... done
Collecting pdfminer.six
Downloading pdfminer.six-20221105-py3-none-any.whl (5.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.6/5.6 MB 14.2 MB/s eta 0:00:00
Requirement already satisfied: charset-normalizer>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from pdfminer.six) (3.3.2)
Requirement already satisfied: cryptography>=36.0.0 in /usr/local/lib/python3.10/dist-packages (from pdfminer.six) (41.0.7)
Requirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.10/dist-packages (from cryptography>=36.0.0->pdfminer.six) (1.16.0)
Requirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi>=1.12->cryptography>=36.0.0->pdfminer.six) (2.21)
Installing collected packages: pdfminer.six
Successfully installed pdfminer.six-20221105
86182
WARNING:cassandra.cluster:Downgrading core protocol version from 66 to 65 for a53f4497-bcbb-47b6-9123-1447ba638d3b-us-east1.db.astra.datastax.com:29042:3fc0355c-8dc0-4fde-b81f-b55692bc4856. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
WARNING:cassandra.cluster:Downgrading core protocol version from 65 to 5 for a53f4497-bcbb-47b6-9123-1447ba638d3b-us-east1.db.astra.datastax.com:29042:3fc0355c-8dc0-4fde-b81f-b55692bc4856. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
ERROR:cassandra.connection:Closing connection <AsyncoreConnection(135702977708192) a53f4497-bcbb-47b6-9123-1447ba638d3b-us-east1.db.astra.datastax.com:29042:3fc0355c-8dc0-4fde-b81f-b55692bc4856> due to protocol error: Error from server: code=000a [Protocol error] message="Beta version of the protocol used (5/v5-beta), but USE_BETA flag is unset"
WARNING:cassandra.cluster:Downgrading core protocol version from 5 to 4 for a53f4497-bcbb-47b6-9123-1447ba638d3b-us-east1.db.astra.datastax.com:29042:3fc0355c-8dc0-4fde-b81f-b55692bc4856. To avoid this, it is best practice to explicitly set Cluster(protocol_version) to the version supported by your cluster. http://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Cluster.protocol_version
107
['STATE OF \nGLOBAL AIR/2019\nA SPECIAL REPORT ON GLOBAL EXPOSURE TO AIR POLLUTION \nAND ITS DISEASE BURDEN\n The State of Global Air is a collaboration between the Health Effects Institute \nand the Institute for Health Metrics and Evaluation’s Global Burden of Disease Project.\nThe State of Global Air is a collaboration between the \nInstitute for Health Metrics and Evaluation’s Global Burden of Disease Project \nCitation: Health Effects Institute. 2019. State of Global Air 2019. Special Report. Boston, MA:Health Effects Institute.\nand the Health Effects Institute.\nISSN 2578-6873 © 2019 Health Effects Institute\n\x0cWhat is the State of Global Air? \nThe State of Global Air report brings into one place the latest \ninformation on air quality and health for countries around the \nglobe. It is produced annually by the Health Effects Institute and \nthe Institute for Health Metrics and Evaluation’s Global Burden \nof Disease project as a source of objective, peer-reviewed air', 'globe. 
It is produced annually by the Health Effects Institute and \nthe Institute for Health Metrics and Evaluation’s Global Burden \nof Disease project as a source of objective, peer-reviewed air \nquality data and analysis. \nLike previous reports, this year’s publication presents infor-\nmation on outdoor and household air pollution and on the health \nimpacts of exposure to air pollution. For the first time, the report \nalso explores how air pollution affects life expectancy. \nWho is it for?\nThe report is designed to give citizens, journalists, policy mak-\ners, and scientists access to reliable, meaningful information \nabout air pollution exposure and its health effects. The report is \nfree and available to the public. \nHow can I explore the data?\nThis report has a companion interactive website that provides \ntools to explore, compare, and download data and graphics with \nthe latest air pollution levels and associated burden of disease.', 'This report has a companion interactive website that provides \ntools to explore, compare, and download data and graphics with \nthe latest air pollution levels and associated burden of disease. \nAnyone can use the website to access data for 195 individual \ncountries or territories and their related regions, as well as \ntrack trends from 1990 to 2017. 
Find it at stateofglobalair.org/data.\nWhere will I find information on:\nIntroduction ...................................................................page 1\nExposure to Air Pollution ...............................................page 3\nHousehold Air Pollution .................................................page 8\nThe Burden of Disease from Air Pollution ...................page 11\nAir Pollution’s Impact on Life Expectancy ....................page 16\nConclusions .................................................................page 19\nAdditional Resources ...................................................page 20', 'Conclusions .................................................................page 19\nAdditional Resources ...................................................page 20\nContributors and Funding ............................................page 22\n\x0cINTRODUCTION\nO ur health is strongly influenced by the air we breathe. \nPoor air quality causes people to die younger as a re-\nsult of cardiovascular and respiratory diseases, and also \nexacerbates chronic diseases such as asthma, causing \npeople to miss school or work and eroding quality of life. \nAir pollution affects the young and the old, the rich \nand the poor, and people in all areas of the globe. Research over \nthe past several decades has revealed a multitude of ways in which \npoor air quality affects our health and quality of life, and scientists \ncontinue to learn more. Studies have also continued to illuminate \nFigure 1. Global ranking of risk factors by total number of deaths from all causes for', 'continue to learn more. Studies have also continued to illuminate \nFigure 1. Global ranking of risk factors by total number of deaths from all causes for \nall ages and both sexes in 2017.\nAir pollution is the fifth leading risk factor for \nmortality worldwide. 
It is responsible for more \ndeaths than many better-known risk factors such as \nmalnutrition, alcohol use, and physical inactivity. \nEach year, more people die from air pollution–related \ndisease than from road traffic injuries or malaria. \nthe causes of air pollution, helping us understand why air quality is \nworse in some places and better in others. \nThis publication, the third annual State of Global Air report, \npresents the latest information on \nworldwide air pollution exposures \nand health impacts. It draws from \nthe most recent evidence produced \nas part of the Global Burden of Dis-\nease (GBD) project of the Institute for \nHealth Metrics and Evaluation (IHME)', 'the most recent evidence produced \nas part of the Global Burden of Dis-\nease (GBD) project of the Institute for \nHealth Metrics and Evaluation (IHME) \n(see textbox “Improving Global Bur-\nden of Disease Estimates with New \nand Better Data”). The State of Glob-\nal Air report is produced by the Health \nEffects Institute (HEI).\nBuilding on previous State of Glob-\nal Air reports, this publication offers \na global update on outdoor (ambient) \nair pollution and on household air \npollution from use of solid fuels for \ncooking. To track outdoor air quality, \nthe report focuses on the concentra-\ntions of two pollutants in particular: \nfine particle air pollution (particulate \nmatter measuring less than 2.5 mi-\ncrometers in aerodynamic diameter, \nor PM2.5) and ozone found near ground \nlevel (tropospheric ozone). This as-\nsessment also tracks exposure to \nhousehold air pollution from burning \nfuels such as coal, wood, or biomass', 'or PM2.5) and ozone found near ground \nlevel (tropospheric ozone). This as-\nsessment also tracks exposure to \nhousehold air pollution from burning \nfuels such as coal, wood, or biomass \nfor cooking. 
These forms of air pollu-\ntion are considered key indicators of \n 1 STATE O F G LO BAL A IR / 2 0 1 9\nExplore the rankings further at the IHME/GBD Compare site.\n\x0cair quality, and each contributes to the collective impact of air pollu-\ntion on human health. \nAir pollution consistently ranks among the top risk factors for \ndeath and disability worldwide. Breathing polluted air has long been \nrecognized as increasing a person’s chances of developing heart dis-\nease, chronic respiratory diseases, lung infections, and cancer. In \n2017, air pollution was the fifth highest mortality risk factor globally \nand was associated with about 4.9 million deaths and 147 million \nyears of healthy life lost (Figure 1). This report summarizes the latest', 'and was associated with about 4.9 million deaths and 147 million \nyears of healthy life lost (Figure 1). This report summarizes the latest \nevidence on the health impacts of air pollution and discusses how \nthese health impacts affect how long, and how well, people live. \nWHAT’S NEW THIS YEAR?\n• Assessing impacts on life expectancy. Life expectancy — a \nmeasure of how long a person can expect to live — has \nalways been an important indicator of the health of a society. \nThis year, the State of Global Air features an analysis of how \nmuch air pollution reduces life expectancy in countries around \nthe world. \n• Accounting for risks from type 2 diabetes. In light of recent \nevidence indicating that air pollution contributes to devel-\nopment of type 2 diabetes, this year’s assessment includes \nestimates of the related health burden. \nIMPROVING GLOBAL BURDEN OF DISEASE \nESTIMATES WITH NEW AND BETTER DATA: \nANNUAL UPDATES\nDespite some differences, the estimates of air', 'estimates of the related health burden. 
\nIMPROVING GLOBAL BURDEN OF DISEASE \nESTIMATES WITH NEW AND BETTER DATA: \nANNUAL UPDATES\nDespite some differences, the estimates of air \npollution health burden from multiple analyses \nconsistently show that air pollution has a large \nimpact on population health. \nAs the science continues to advance, the GBD project has incorpo-\nrated new data and methodology into its air pollution and health \nassessments. This year’s State of Global Air report presents updated \ninformation for all of the indicators addressed in previous reports.\nWhile new methodology may result in differences between as-\nsessments from previous years, trends over time are recalculated with \neach update to ensure the findings are internally consistent within \neach report. These updates help ensure that each assessment provides \nthe most accurate information available based on rigorous scientific \nmethods: \n• Eliminating double counting. This year’s report analyzes the', 'the most accurate information available based on rigorous scientific \nmethods: \n• Eliminating double counting. This year’s report analyzes the \nburden of disease from ambient air pollution independently \nfrom that of household air pollution. Past estimates had the \npotential for some double counting of the disease burden \nin populations exposed to both ambient and household air \npollution. \n• New methods for analyzing health impacts from exposures. The \nmathematical methods for analyzing how exposure to pollution \nrelates to specific health risks (known as exposure–response \nfunctions) have been updated. The new methods reflect data \nfrom recent epidemiological studies on the impacts of ambient \nPM2.5, household air pollution and secondhand smoke and from \nupdated literature reviews on the impacts of active smoking. \n• New methods for assessing ozone. The method for estimat-\ning ozone concentrations has been revised, incorporating for', 'updated literature reviews on the impacts of active smoking. 
\n• New methods for assessing ozone. The method for estimat-\ning ozone concentrations has been revised, incorporating for \nthe first time an extensive database of ground-level ozone \nmeasurements. In addition, the ozone exposure metric was \nchanged to an 8-hour daily maximum level to align with more \nrecent epidemiological analyses. \n• Inclusion of more PM2.5 measurements. The database of ground \nmeasurements of PM2.5 has been expanded from approximately \n6,000 to 9,960 sites. Including more measurements in the \nmodels used to calibrate satellite-based estimates results in \nfiner-grained estimates of PM2.5 concentrations that vary more \nsmoothly and realistically over space and time. In addition, es-\ntimates of PM2.5 exposure now directly incorporate uncertainty \ndistributions from the calibration model. \nOf these changes, those related to eliminating double counting and', 'timates of PM2.5 exposure now directly incorporate uncertainty \ndistributions from the calibration model. \nOf these changes, those related to eliminating double counting and \nthe updating of exposure–response functions have the largest impact \non the disease burden estimates. For more information about these \nchanges, please refer to the Additional Resources. \nOther groups have estimated the burden of air pollution on human \nhealth as concern over air pollution has grown. Most notably, the \nWorld Health Organization (WHO) has long made its own periodic \nestimates, with the most recent analysis (of 2016 data) released in \nearly 2018. IHME, the primary source of information for this report, is \nthe only organization that updates its estimates annually; its methods \nare increasingly being adopted by others, including the WHO. Given \nthe complexity of the process for developing these estimates, some \nvariation is not surprising. However, as the methods used by different', 'the complexity of the process for developing these estimates, some \nvariation is not surprising. 
However, as the methods used by different \norganizations converge, this variability is expected to diminish. \n 2 STATE OF GLO BAL AIR / 2 0 1 9\n\x0cEXPOSURE TO AIR POLLUTION \nT wo main pollutants are considered key indicators of \nambient, or outdoor, air quality: fine particle pollution \n— airborne particulate matter measuring less than \n2.5 micrometers in aerodynamic diameter, commonly \nreferred to as PM2.5 — and ground-level (tropospheric) \nozone. Analyses show that much of the world’s popula-\ntion lives in areas with unhealthy concentrations of these pollutants. \nThe latest data reveal encouraging improvements in some areas and \nworsening conditions in others. \nHousehold air pollution from the burning of solid fuels for cooking is \nan important source of exposure to particulate matter inside the home.', 'worsening conditions in others. \nHousehold air pollution from the burning of solid fuels for cooking is \nan important source of exposure to particulate matter inside the home. \nThis practice continues to be widespread in many regions of the world \nand can also be a substantial contributor to ambient pollution. \nFINE PARTICLE AIR POLLUTION\nFine particle air pollution comes from vehicle emissions, coal-burning \npower plants, industrial emissions, and many other human and natural \nsources. While exposures to larger airborne particles can also be harm-\nful, studies have shown that exposure to high average concentrations \nof PM2.5 over the course of several years is the most consistent and \nrobust predictor of mortality from cardiovascular, respiratory, and other \ntypes of diseases (see textbox “How PM2.5 Exposure Is Estimated”).\nMore than 90% of people worldwide live in areas \nexceeding the WHO Guideline for healthy air. More \nthan half live in areas that do not even meet WHO’s', 'More than 90% of people worldwide live in areas \nexceeding the WHO Guideline for healthy air. 
More \nthan half live in areas that do not even meet WHO’s \nleast-stringent air quality target. \nAround the world, ambient levels of PM2.5 continue to exceed \nthe Air Quality Guideline established by the WHO. The guideline \nfor annual average PM2.5 concentration is set at 10 µg/m3 based on \nevidence of the health effects of long-term exposure to PM2.5, but \nthe WHO acknowledged it could not rule out health effects below \nthat level. For regions of the world where air pollution is highest, \nthe WHO suggested three interim air quality targets set at pro-\ngressively lower concentrations: Interim Target 1 (IT-1, ≤35 µg/m3), \nInterim Target 2 (IT-2, ≤25 µg/m3), and Interim Target 3 (IT-3, ≤15 \nµg/m3). Figure 2 shows where these guidelines were still exceeded \nin 2017.\nIn 2017, 92% of the world’s population lived in areas that ex-', 'µg/m3). Figure 2 shows where these guidelines were still exceeded \nin 2017.\nIn 2017, 92% of the world’s population lived in areas that ex-\nceeded the WHO guideline for PM2.5. Fifty-four percent lived in areas \nexceeding IT-1, 67% lived in areas exceeding IT-2, and 82% lived in \nareas exceeding IT-3. \nFigure 2. Annual average PM2.5 concentrations in 2017 relative to the WHO Air Quality Guideline.\n 3 STATE OF GLO BAL AIR / 2 0 1 9\n\x0cHOW PM2.5 EXPOSURE IS ESTIMATED \nParticulate matter concentrations are measured in micrograms of \nparticulate matter per cubic meter of air, or µg/m3. Many of the \nworld’s more developed countries monitor PM2.5 concentrations \nthrough extensive networks of monitoring stations concentrated \naround urban areas. 
These stations provide continuous hourly mea-\nsurements of pollution levels, offering a rich source of data that has \nserved as the foundation for most studies of the potential health', 'surements of pollution levels, offering a rich source of data that has \nserved as the foundation for most studies of the potential health \neffects of air pollution and for management of air quality. \nWhile these data sources are valuable, on-the-ground air quality \nmonitoring stations are few and far between in the rapidly growing \nurban areas of countries at low and middle levels of development, \nas well as in rural and suburban areas throughout the world. To \nfill the gaps and provide a consistent view of air pollution levels \naround the world, scientists combine available ground measure-\nments with observations from satellites and information from \nglobal chemical transport models. \nUsing this combined approach, scientists systematically \nestimate annual average concentrations of PM2.5 across the entire \nglobe divided into blocks, or grid cells, each covering 0.1° × 0.1° of \nlongitude and latitude (approximately 11 km × 11 km at the equa-', 'globe divided into blocks, or grid cells, each covering 0.1° × 0.1° of \nlongitude and latitude (approximately 11 km × 11 km at the equa-\ntor). To estimate the annual average PM2.5 exposures for the popula-\ntion in a specific country, scientists combine the concentrations \nin each block with the number of people living within each block \nto produce a population-weighted annual average concentration. \nPopulation-weighted annual average concentrations are better esti-\nmates of population exposures, because they give greater weight to \nthe air pollution experienced where most people live. \nPATTERNS AND TRENDS IN PM2.5 EXPOSURE\nThe GBD project estimated population exposures to PM2.5 across the \nworld for the period 1990 to 2017. 
These assessments reveal a lot \nof regional variation in PM2.5 exposure and point to valuable insights \nabout the drivers behind high PM2.5 exposure and the impact of ef-\nforts to improve air quality. \nExposures to PM2.5 Vary Substantially Across Countries', 'about the drivers behind high PM2.5 exposure and the impact of ef-\nforts to improve air quality. \nExposures to PM2.5 Vary Substantially Across Countries \nand Regions \nExposures to PM2.5 show substantial variation both between and \nwithin regions of the world. In 2017, annual PM2.5 exposures were \nhighest in South Asia, where Nepal (100 µg/m3), India (91 µg/m3), \nBangladesh (61 µg/m3), and Pakistan (58 µg/m3) had the highest \nexposures. Bhutan’s exposure level (38 µg/m3) was the lowest in the \nregion but was still above WHO’s first interim target (IT-1). \nThe region with the second-highest PM2.5 exposures was west-\nern sub-Saharan Africa, where Niger (94 µg/m3), Cameroon (73 µg/\nm3), Nigeria (72 µg/m3), Chad (66 µg/m3), and Mauritania (47 µg/m3) \nhad the highest exposures. Countries in North Africa and the Middle \n 4 STATE OF GLO BAL AIR / 2 0 1 9\nEast experienced similarly high levels, for example, Qatar (91 µg/m3),', 'had the highest exposures. Countries in North Africa and the Middle \n 4 STATE OF GLO BAL AIR / 2 0 1 9\nEast experienced similarly high levels, for example, Qatar (91 µg/m3), \nSaudi Arabia (88 µg/m3), Egypt (87 µg/m3), Bahrain (71 µg/m3), Iraq \n(62 µg/m3), and Kuwait (61 µg/m3). All other countries in this region \nhad PM2.5 exposures between 30 and 60 µg/m3. In the region of East \nAsia, China had the highest PM2.5 exposures (53 µg/m3), while North \nKorea and Taiwan experienced concentrations of 32 and 23 µg/m3, \nrespectively. \nThe 10 countries with the lowest national PM2.5 exposure levels \nwere the Maldives, the United States, Norway, Estonia, Iceland, Can-\nada, Sweden, New Zealand, Brunei, and Finland. 
Population-weight-\ned PM2.5 concentrations averaged 8 µg/m3 or less in these countries. \nThe sources responsible for PM2.5 pollution vary within and be-\ntween countries and regions. Dust from the Sahara Desert contrib-', 'The sources responsible for PM2.5 pollution vary within and be-\ntween countries and regions. Dust from the Sahara Desert contrib-\nutes to the high particulate matter concentrations in North Africa \nand the Middle East, as well as to the high concentrations in some \ncountries in western sub-Saharan Africa. A recent analysis by HEI \nfound that major PM2.5 sources in India include household burning \nof solid fuels; dust from construction, roads, and other activities; \nindustrial and power plant burning of coal; brick production; trans-\nportation; and diesel-powered equipment. The relative importance of \nvarious sources of PM2.5 in China was quite different, with a separate \nstudy identifying the major sources as industrial and power plant \nburning of coal and other fuels; transportation; household burning of \nbiomass; open burning of agricultural fields; and household burning', 'burning of coal and other fuels; transportation; household burning of \nbiomass; open burning of agricultural fields; and household burning \nof coal for cooking and heating. Information on the HEI India and \nChina studies can be found in Additional Resources.\nThe mix and magnitude of the contribution of different sources \nare changing as some countries restrict activities or emissions to re-\nduce air pollution while others continue or increase their reliance on \ncoal and other major contributors to air pollution. Future editions of \nthe State of Global Air will feature the data on source contributions \nat national and global levels. 
\nExposures Are Stagnant in Some Places, Improving in \nOthers\nGlobally, the percentage of the world’s population living in areas that \nexceed the most-stringent WHO Air Quality Guideline (10 µg/m3 PM2.5) \ndecreased slightly, from 96% in 1990 to 92% in 2017. At the same time,', 'exceed the most-stringent WHO Air Quality Guideline (10 µg/m3 PM2.5) \ndecreased slightly, from 96% in 1990 to 92% in 2017. At the same time, \nthe percentage living in areas that fail to meet even the least-stringent \ntarget, IT-1 (35 µg/m3 PM2.5), remained steady at around 54%. \nChanges in air quality have been experienced unevenly across \ndifferent countries over the past several decades. Figure 3 shows the \npercentages of the populations living in areas exceeding the WHO \nguideline and each of the three interim targets for the 11 most pop-\nulous countries and the European Union in 1990, 2010, and 2017. \nThe left-most column in the figure shows that decreases in only \nhalf of the most populous countries have driven the slight global de-\ncrease in percentage of people living in areas exceeding the WHO \nguideline. The most striking decrease occurred in the United States, \nwhere the proportion of people living in areas exceeding the WHO', 'guideline. The most striking decrease occurred in the United States, \nwhere the proportion of people living in areas exceeding the WHO \n\x0cFigure 3. 
Percentage of population living in areas with PM2.5 concentrations exceeding the WHO Air Quality Guideline \nand interim targets in the 11 most populous countries and the European Union in 1990, 2010, and 2017.\nWHO Air Quality Guideline and Interim Targets for PM2.5 Levels \n Guideline > 10 µg/m3\nIT-3 > 15 µg/m3\n IT-2 > 25 µg/m3\nIT-1 > 35 µg/m3\nBangladesh\nPakistan\nNigeria\nIndia\nChina\nMexico\nIndonesia\nRussia\nEU\nJapan\nBrazil\nUSA\n1990\n2010\n2017\n0\n2 5\n5 0\n7 5\n100\n0\n2 5\n5 0\n7 5\n100\n0\nPopulation (%)\n2 5\n5 0\n7 5\n100\n0\n2 5\n5 0\n7 5\n100\nguideline plummeted from 50% in 1990 to about 40% in 2010 and to \njust 3% in 2017. In Brazil, after increasing slightly in 2010, the per-\ncentage of the population living in areas above the WHO guideline \ndeclined by nearly 23% to 68% in 2017. The EU and Japan both ex-', 'centage of the population living in areas above the WHO guideline \ndeclined by nearly 23% to 68% in 2017. The EU and Japan both ex-\nperienced 14% declines, mostly since 2010, but both still had about \n80% of their populations living in areas above the WHO guideline in \n2017. In the remaining countries the percentages of population living \nin areas above the guideline ranged from 92% in Russia to 100% in \nChina, India, Nigeria, Pakistan and Bangladesh.\nThe remaining three columns of Figure 3 show that progress since \n1990 toward meeting the three interim targets has also been mainly \nevident in the same set of countries — Brazil, Japan, the EU, Russia, \nIndonesia, and Mexico. These are also countries where the percent-\nages of population exceeding the least-stringent targets (IT-1 and \nIT-2) were comparatively low in 1990. \nHowever, in the remaining countries, most of which are in Asia, \nair quality has remained stubbornly poor. In Bangladesh and Paki-', ... '8587(17)30097-9.\nOrganization of Economic Cooperation and Development. 2014. The \nCost of Air Pollution: Health Impacts of Road Transport. 
Paris:OECD \nPublishing. Available: https://dx.doi.org/10.1787/9789264210448-en. \nWorld Bank. 2016. The Cost of Air Pollution: Strengthening \nthe Economic Case for Action. Washington, DC:World Bank \nGroup. Available: http://documents.worldbank.org/curated/\nen/781521473177013155/The-cost-of-air-pollution-strengthening-\nthe-economic-case-for-action [accessed 5 March 2019]. \nLIFE EXPECTANCY\nApte JS, Brauer M, Cohen AJ, Ezzati M, Pope CA. 2018. Ambient \nPM2.5 reduces global and regional life expectancy. Environ Sci \nTechnol Lett 5:546–551; https://doi.org/10.1021/acs.estlett.8b00360.\n 21 STATE O F G LOBAL AIR / 2 0 1 9\n\x0cCONTRIBUTORS AND FUNDING\nCONTRIBUTORS\nHealth Effects Institute: HEI is an independent global health and \nair pollution research institute. It is the primary developer of the', 'CONTRIBUTORS AND FUNDING\nCONTRIBUTORS\nHealth Effects Institute: HEI is an independent global health and \nair pollution research institute. It is the primary developer of the \nState of Global Air report and hosts and manages the website. HEI \nalso coordinates input from all other members of the team and fa-\ncilitates contact with media partners. Key HEI contributors include \nKaty Walker, principal scientist; Hilary Selby Polk, managing editor; \nAnnemoon van Erp, managing scientist; Pallavi Pant, staff scientist; \nKethural Manokaran and Kathryn Liziewski, research assistants; Aar-\non Cohen, consulting scientist at HEI and affiliate professor of global \nhealth at IHME; Bob O’Keefe, vice president; and Dan Greenbaum, \npresident.\nThe Institute for Health Metrics and Evaluation: IHME is an \nindependent population health research center at the University of \nWashington School of Medicine, Seattle. It provides the underlying', 'The Institute for Health Metrics and Evaluation: IHME is an \nindependent population health research center at the University of \nWashington School of Medicine, Seattle. 
It provides the underlying \nair pollution and health data and other critical support for this proj-\nect. Key IHME contributors include Jeffrey Stanaway, faculty; Ashley \nMarks, project officer; Kate Causey, post-bachelor fellow; William \nGodwin, post-bachelor fellow; and Dean Owen, senior manager for \nmarketing and communications.\nUniversity of British Columbia: Professor Michael Brauer of \nthe School of Population and Public Health at UBC serves as an ex-\npert adviser on this project. Dr. Brauer is a long-time principal col-\nlaborator on the air pollution assessment for the Global Burden of \nDisease project and led the effort to define the project’s global air \npollution exposure assessment methodology. \nUniversity of Texas at Austin: Assistant Professor Joshua Apte', 'Disease project and led the effort to define the project’s global air \npollution exposure assessment methodology. \nUniversity of Texas at Austin: Assistant Professor Joshua Apte \nof the Department of Civil, Architectural, and Environmental Engi-\nneering at the University of Texas at Austin conducts research on the \nassessment of air pollution exposures. Using data and methods from \nthe Global Burden of Disease project, Dr. Apte estimated the glob-\nal impacts of air pollution on life expectancy. These estimates have \nbeen incorporated in the State of Global Air this year.\nADDITIONAL ACKNOWLEDGMENTS\nZev Ross, Hollie Olmstead, and Chris Marx at ZevRoss Spatial Ana-\nlytics provided data visualization support and developed the interac-\ntive feature of the site. Glenn Ruga at Glenn Ruga/Visual Communi-\ncations designed the website; Ezra Klughaupt and Diane Szczesuil at \nCharles River Web developed the website; and Anne Frances John-', 'tive feature of the site. 
Glenn Ruga at Glenn Ruga/Visual Communi-\ncations designed the website; Ezra Klughaupt and Diane Szczesuil at \nCharles River Web developed the website; and Anne Frances John-\nson of Creative Science Writing provided additional writing support \nfor the report and website. \nFUNDING \nThe State of Global Air project is funded by Bloomberg Philanthropies \nand the William and Flora Hewlett Foundation.\nGlossary\nFor a glossary of terms, see the State of Global Air website.\nHow to Cite This Publication\nHealth Effects Institute. 2019. State of Global Air 2019. Special Report. \nBoston, MA:Health Effects Institute.\nHealth Effects Institute\n75 Federal Street, Suite 1400\nBoston, MA 02110, USA\n 22 STATE OF G LOBAL AIR / 2 0 1 9']

Inserted 107 headlines.

Enter your question (or type 'quit' to exit): what is there on the last page of this pdf?
WARNING:cassandra.protocol:Server warning: Top-K queries can only be run with consistency level ONE / LOCAL_ONE / NODE_LOCAL. Consistency level LOCAL_QUORUM was requested. Downgrading the consistency level to LOCAL_ONE.

QUESTION: "what is there on the last page of this pdf?"
ANSWER: "I don't know, as I do not have access to the PDF."

FIRST DOCUMENTS BY RELEVANCE:
[0.8739] "material to these libraries. Priority 2: Reaching the Last Mile 35. Prime Minister Vajpayee’s government ..."
[0.8720] "levels and provide infrastructure for accessing the National Digital Library resources. 34. Additionally, ..."
[0.8683] "3.“You will then…” (create 5 emails, make a video script, summarize, etc.) 4.“In a _____ tone/style…” (upbe ..."
[0.8675] " Chemicals and Petrochemicals  Marine products  Lab Grown Diamonds  Precious Metals  Metals ..."

What's your next question (or type 'quit' to exit): What is the address of "Health Effects Institute"

QUESTION: "What is the address of "Health Effects Institute""
ANSWER: "I don't know, as the address is not mentioned in the given context."

FIRST DOCUMENTS BY RELEVANCE:
[0.9220] "CONTRIBUTORS AND FUNDING CONTRIBUTORS Health Effects Institute: HEI is an independent global health and ai ..."
[0.9220] "CONTRIBUTORS AND FUNDING CONTRIBUTORS Health Effects Institute: HEI is an independent global health and ai ..."
[0.9220] "CONTRIBUTORS AND FUNDING CONTRIBUTORS Health Effects Institute: HEI is an independent global health and ai ..."
[0.9220] "CONTRIBUTORS AND FUNDING CONTRIBUTORS Health Effects Institute: HEI is an independent global health and ai ..."

What's your next question (or type 'quit' to exit): What is PM2.5

QUESTION: "What is PM2.5"
ANSWER: "PM2.5 is the measurement for particulate matter concentrations in micrograms of particulate matter per cubic meter of air. It is used to measure air pollution levels and is a major contributor to health issues."

FIRST DOCUMENTS BY RELEVANCE:
[0.9287] "µg/m3). Figure 2 shows where these guidelines were still exceeded in 2017. In 2017, 92% of the world’ ..."
[0.9287] "µg/m3). Figure 2 shows where these guidelines were still exceeded in 2017. In 2017, 92% of the world’ ..."
[0.9287] "µg/m3). Figure 2 shows where these guidelines were still exceeded in 2017. In 2017, 92% of the world’ ..."
[0.9287] "µg/m3). Figure 2 shows where these guidelines were still exceeded in 2017. In 2017, 92% of the world’ ..."

What's your next question (or type 'quit' to exit): What is full form of SDI

QUESTION: "What is full form of SDI"
ANSWER: "I don't know."

FIRST DOCUMENTS BY RELEVANCE:
[0.8752] "encouraged to leverage resources from the grants of the 15th Finance Commission, as well as existing schem ..."
[0.8731] "covering 500 blocks for saturation of essential government services across multiple domains such as health ..."
[0.8671] "each other and act as the ‘Saptarishi’ guiding us through the Amrit Kaal. 1) Inclusive Development 2) ..."
[0.8665] "conducting interdisciplinary research, develop cutting-edge applications and scalable problem solutions in ..."

What's your next question (or type 'quit' to exit): How many total people were exposed to household air pollution in 2017.

QUESTION: "How many total people were exposed to household air pollution in 2017."
ANSWER: "3.6 billion people were exposed to household air pollution in 2017."

FIRST DOCUMENTS BY RELEVANCE:
[0.9471] "fuels has declined. However, disparities persist, and populations in less-developed countries continue to ..."
[0.9470] "fuels has declined. However, disparities persist, and populations in less-developed countries continue to ..."
[0.9470] "fuels has declined. However, disparities persist, and populations in less-developed countries continue to ..."
[0.9470] "fuels has declined. However, disparities persist, and populations in less-developed countries continue to ..."

What's your next question (or type 'quit' to exit): quit
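In several of the queries above, all four retrieved documents carry the same text and nearly the same score (for example, 0.9220 four times for the "Health Effects Institute" question). That pattern is consistent with the same chunks having been inserted into the table more than once, for instance by re-running the insert cell, although the transcript alone does not confirm the cause. Whatever the reason, duplicates crowd out other relevant chunks, which may be why the address was not found even though it appears in the PDF. Below is a minimal sketch of filtering duplicate hits before they are passed to the LLM. The `Hit` shape, the `dedupe_hits` helper, and the sample data are hypothetical and are not part of LangChain or CassIO; results from a real vector store would first need to be converted into `(text, score)` pairs.

```python
from typing import Iterable, List, Tuple

# Hypothetical shape for illustration: (chunk text, similarity score).
Hit = Tuple[str, float]


def dedupe_hits(hits: Iterable[Hit], k: int = 4) -> List[Hit]:
    """Keep the first occurrence of each distinct chunk text, up to k hits.

    Hits are assumed to arrive sorted by descending score, so the first
    occurrence of a chunk is also its best-scoring one.
    """
    seen = set()
    unique: List[Hit] = []
    for text, score in hits:
        if len(unique) >= k:
            break
        # Collapse whitespace so copies that differ only in spacing match.
        key = " ".join(text.split())
        if key in seen:
            continue
        seen.add(key)
        unique.append((text, score))
    return unique


# Sample data shaped like the transcript: one chunk returned four times,
# followed by a different chunk. The last score is made up for illustration.
duplicate_chunk = "CONTRIBUTORS AND FUNDING CONTRIBUTORS Health Effects Institute: HEI is ..."
address_chunk = "Health Effects Institute 75 Federal Street, Suite 1400 Boston, MA 02110, USA"
raw_hits: List[Hit] = [
    (duplicate_chunk, 0.9220),
    (duplicate_chunk, 0.9220),
    (duplicate_chunk, 0.9220),
    (duplicate_chunk, 0.9220),
    (address_chunk, 0.9000),  # made-up score
]

unique_hits = dedupe_hits(raw_hits, k=4)
for text, score in unique_hits:
    print("[%0.4f] %s" % (score, text[:60]))
```

For this to help in practice, the store would need to be asked for more than `k` results (say, a few times `k`) so that distinct chunks remain after filtering. Emptying the table before re-inserting the chunks would address the likely root cause directly.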