{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using the COVID-19 API with Pandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this exercise we use the [COVID-19](https://api.covid19api.com/countries) API, developed by Kyle Redelinghuys and based on [the data](https://github.com/CSSEGISandData/COVID-19) published by [Johns Hopkins University](https://systems.jhu.edu/research/public-health/ncov/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installing libraries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To begin, we need to **install the library** we will use to load and plot the COVID-19 incidence data. To do so, we run the command *pip install pandas*. To indicate that the line is a shell command rather than Python code, we put a *!* at the start of the line." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: pandas in /usr/local/lib/python3.8/dist-packages (1.3.1)\n", "Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (2020.4)\n", "Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (2.8.1)\n", "Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (1.21.1)\n", "Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7.3->pandas) (1.14.0)\n" ] } ], "source": [ "!pip install pandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting up Pandas " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once it is installed, we need to **import it into the project** we are working on. 
To do so, we use the *import* statement and import the library under the short alias *pd*, as shown below." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a variable" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the *Pandas* library installed and imported, we can start creating the variables we need to plot the API data later on. The first step is to **define the variable** *url*, which tells Python where the data comes from. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "url = 'https://api.covid19api.com/countries'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a DataFrame " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once we have the link to the data, we can start displaying it. The first step is to **create a *dataframe***, a table with rows and columns that holds the data. To do so, we use the *Pandas* function that reads JSON, the format the API returns: *pd.read_json(url)*. Passing *url* as the argument tells *Pandas* which data to read." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "df = pd.read_json(url)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**We evaluate *df*** to check that everything loaded correctly." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country Slug ISO2\n", "0 Gambia gambia GM\n", "1 Paraguay paraguay PY\n", "2 Trinidad and Tobago trinidad-and-tobago TT\n", "3 Austria austria AT\n", "4 Sao Tome and Principe sao-tome-and-principe ST\n", ".. ... ... ...\n", "243 Cape Verde cape-verde CV\n", "244 Israel israel IL\n", "245 Bolivia bolivia BO\n", "246 Latvia latvia LV\n", "247 Papua New Guinea papua-new-guinea PG\n", "\n", "[248 rows x 3 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploración tabla" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Como vemos, al componerse de 248 filas, solo se nos muestra un extracto, las primeras y las últimas filas. Ahora vamos a seguir explorando la tabla." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Por ejemplo, vamos a **visualizar las 6 primeras filas**. Para ello, a través de *head*, le decimos que nos muestre el dataframe con las X primeras filas, que en este caso le hemos indicado que sean 6." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country Slug ISO2\n", "0 Gambia gambia GM\n", "1 Paraguay paraguay PY\n", "2 Trinidad and Tobago trinidad-and-tobago TT\n", "3 Austria austria AT\n", "4 Sao Tome and Principe sao-tome-and-principe ST\n", "5 Timor-Leste timor-leste TL" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head(6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ahora lo mismo, pero al contrario, empezando por el final. Para decirle que queremos comenzar por el final, **en lugar de *head* usamos *tail***, con el mismo formato: entre paréntesis indicamos el X número de filas que queremos visualizar." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country Slug ISO2\n", "246 Latvia latvia LV\n", "247 Papua New Guinea papua-new-guinea PG" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.tail(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Si queremos **conocer más datos técnicos** acerca de la composición de lo que estamos usando, podemos usar *info*, que nos indicará de qué tipo son los datos por columnas, el total de filas y columnas, cuánto utiliza de memoria o si existen datos nulos." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 248 entries, 0 to 247\n", "Data columns (total 3 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Country 248 non-null object\n", " 1 Slug 248 non-null object\n", " 2 ISO2 248 non-null object\n", "dtypes: object(3)\n", "memory usage: 5.9+ KB\n" ] } ], "source": [ "df.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Si queremos filtrar por columnas para ver los datos de alguna en particular, seguimos usando el df pero ahora, entre corchetes, ponemos el nombre de la columna que queramos visualizar. Por ejemplo, para ver una lista (abreviada) de países, usamos *'Country'*." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "0 Gambia\n", "1 Paraguay\n", "2 Trinidad and Tobago\n", "3 Austria\n", "4 Sao Tome and Principe\n", " ... \n", "243 Cape Verde\n", "244 Israel\n", "245 Bolivia\n", "246 Latvia\n", "247 Papua New Guinea\n", "Name: Country, Length: 248, dtype: object" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Country']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Además de esto, también podemos cruzar filas con columnas. 
That is, we can look up the value of row X in column Y. To do so, we add a second pair of brackets with the row label. In this case, we want to know which country is stored in the row labelled 66 of the table or *dataframe*." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Honduras'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Country'][66]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data for Spain\n", "https://api.covid19api.com/country/spain/status/confirmed/live" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have tried several ways of inspecting the countries *dataframe*, we focus on **a few specific countries** to follow the live evolution of their confirmed cases." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To do this, we **define a new link for *Pandas* to read the data from and create a dataframe from it**. This time we look at Spain, so we request the live confirmed cases for the country from the API and store them. We then display them to check that everything went well." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country CountryCode Province City CityCode Lat Lon Cases \\\n", "0 Spain ES 40.46 -3.75 0 \n", "1 Spain ES 40.46 -3.75 0 \n", "2 Spain ES 40.46 -3.75 0 \n", "3 Spain ES 40.46 -3.75 0 \n", "4 Spain ES 40.46 -3.75 0 \n", ".. ... ... ... ... ... ... ... ... \n", "816 Spain ES 40.46 -3.75 11662214 \n", "817 Spain ES 40.46 -3.75 11662214 \n", "818 Spain ES 40.46 -3.75 11736893 \n", "819 Spain ES 40.46 -3.75 11736893 \n", "820 Spain ES 40.46 -3.75 11736893 \n", "\n", " Status Date \n", "0 confirmed 2020-01-22 00:00:00+00:00 \n", "1 confirmed 2020-01-23 00:00:00+00:00 \n", "2 confirmed 2020-01-24 00:00:00+00:00 \n", "3 confirmed 2020-01-25 00:00:00+00:00 \n", "4 confirmed 2020-01-26 00:00:00+00:00 \n", ".. ... ... \n", "816 confirmed 2022-04-17 00:00:00+00:00 \n", "817 confirmed 2022-04-18 00:00:00+00:00 \n", "818 confirmed 2022-04-19 00:00:00+00:00 \n", "819 confirmed 2022-04-20 00:00:00+00:00 \n", "820 confirmed 2022-04-21 00:00:00+00:00 \n", "\n", "[821 rows x 10 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "url_es = 'https://api.covid19api.com/country/spain/status/confirmed/live'\n", "df_es = pd.read_json(url_es)\n", "df_es" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Al igual que hicimos anteriormente, vamos a ver diferentes datos de la tabla. Para empezar, queremos listar las columnas existentes y saber qué datos contiene cada una. Lo hacemos **mostrando el *dataframe* de España y añadiendo *columns***. 
" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Index(['Country', 'CountryCode', 'Province', 'City', 'CityCode', 'Lat', 'Lon',\n", " 'Cases', 'Status', 'Date'],\n", " dtype='object')" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_es.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "De la misma manera, podemos cruzar filas con las columnas para saber un dato concreto. Por ejemplo, para saber cuál es el dato que se encuentra en la primera fila, empezando por el principio, filtramos por la columna 'Date', y con .head(x) le indicamos el número que queremos." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "0 2020-01-22 00:00:00+00:00\n", "Name: Date, dtype: datetime64[ns, UTC]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_es['Date'].head(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Además, para saber los tipos de datos contenidos en el *dataframe*, si existen datos nulos y el tamaño total de la tabla, usamos *info*." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "RangeIndex: 821 entries, 0 to 820\n", "Data columns (total 10 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Country 821 non-null object \n", " 1 CountryCode 821 non-null object \n", " 2 Province 821 non-null object \n", " 3 City 821 non-null object \n", " 4 CityCode 821 non-null object \n", " 5 Lat 821 non-null float64 \n", " 6 Lon 821 non-null float64 \n", " 7 Cases 821 non-null int64 \n", " 8 Status 821 non-null object \n", " 9 Date 821 non-null datetime64[ns, UTC]\n", "dtypes: datetime64[ns, UTC](1), float64(2), int64(1), object(6)\n", "memory usage: 64.3+ KB\n" ] } ], "source": [ "df_es.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For summary statistics of the numeric columns, such as the mean, the standard deviation, the quartiles, and the minimum and maximum values, we use *describe*." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Lat Lon Cases\n", "count 8.210000e+02 821.00 8.210000e+02\n", "mean 4.046000e+01 -3.75 3.426515e+06\n", "std 1.421952e-14 0.00 3.312558e+06\n", "min 4.046000e+01 -3.75 0.000000e+00\n", "25% 4.046000e+01 -3.75 3.428130e+05\n", "50% 4.046000e+01 -3.75 3.180212e+06\n", "75% 4.046000e+01 -3.75 4.953930e+06\n", "max 4.046000e+01 -3.75 1.173689e+07" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_es.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Aquí volvemos a mostrar el *dataframe* entero, aunque aparece de forma abrevada por cuestión de espacio, sin filtrar por columna o tipo de datos." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country CountryCode Province City CityCode Lat Lon Cases \\\n", "0 Spain ES 40.46 -3.75 0 \n", "1 Spain ES 40.46 -3.75 0 \n", "2 Spain ES 40.46 -3.75 0 \n", "3 Spain ES 40.46 -3.75 0 \n", "4 Spain ES 40.46 -3.75 0 \n", ".. ... ... ... ... ... ... ... ... \n", "816 Spain ES 40.46 -3.75 11662214 \n", "817 Spain ES 40.46 -3.75 11662214 \n", "818 Spain ES 40.46 -3.75 11736893 \n", "819 Spain ES 40.46 -3.75 11736893 \n", "820 Spain ES 40.46 -3.75 11736893 \n", "\n", " Status Date \n", "0 confirmed 2020-01-22 00:00:00+00:00 \n", "1 confirmed 2020-01-23 00:00:00+00:00 \n", "2 confirmed 2020-01-24 00:00:00+00:00 \n", "3 confirmed 2020-01-25 00:00:00+00:00 \n", "4 confirmed 2020-01-26 00:00:00+00:00 \n", ".. ... ... \n", "816 confirmed 2022-04-17 00:00:00+00:00 \n", "817 confirmed 2022-04-18 00:00:00+00:00 \n", "818 confirmed 2022-04-19 00:00:00+00:00 \n", "819 confirmed 2022-04-20 00:00:00+00:00 \n", "820 confirmed 2022-04-21 00:00:00+00:00 \n", "\n", "[821 rows x 10 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_es" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A continuación, colocamos como referencia la fecha de los datos, para empezar a obtener una imagen general de la evolución de los datos. Para ello, fijamos con *set_index* el nombre de la columna que queremos colocar como referencia, que en este caso es la fecha (*'Date'*)." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Country CountryCode Province City CityCode Lat \\\n", "Date \n", "2020-01-22 00:00:00+00:00 Spain ES 40.46 \n", "2020-01-23 00:00:00+00:00 Spain ES 40.46 \n", "2020-01-24 00:00:00+00:00 Spain ES 40.46 \n", "2020-01-25 00:00:00+00:00 Spain ES 40.46 \n", "2020-01-26 00:00:00+00:00 Spain ES 40.46 \n", "... ... ... ... ... ... ... \n", "2022-04-17 00:00:00+00:00 Spain ES 40.46 \n", "2022-04-18 00:00:00+00:00 Spain ES 40.46 \n", "2022-04-19 00:00:00+00:00 Spain ES 40.46 \n", "2022-04-20 00:00:00+00:00 Spain ES 40.46 \n", "2022-04-21 00:00:00+00:00 Spain ES 40.46 \n", "\n", " Lon Cases Status \n", "Date \n", "2020-01-22 00:00:00+00:00 -3.75 0 confirmed \n", "2020-01-23 00:00:00+00:00 -3.75 0 confirmed \n", "2020-01-24 00:00:00+00:00 -3.75 0 confirmed \n", "2020-01-25 00:00:00+00:00 -3.75 0 confirmed \n", "2020-01-26 00:00:00+00:00 -3.75 0 confirmed \n", "... ... ... ... \n", "2022-04-17 00:00:00+00:00 -3.75 11662214 confirmed \n", "2022-04-18 00:00:00+00:00 -3.75 11662214 confirmed \n", "2022-04-19 00:00:00+00:00 -3.75 11736893 confirmed \n", "2022-04-20 00:00:00+00:00 -3.75 11736893 confirmed \n", "2022-04-21 00:00:00+00:00 -3.75 11736893 confirmed \n", "\n", "[821 rows x 9 columns]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_es.set_index('Date')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Para ceñirnos estrictamente a los datos que queremos ver, y así dejar fuera los que son innecesarios, como el país, la latitud y la longitud o el código del país, entre corchetes especificamos la(s) columna(s) que queramos mostrar además de la de la fecha." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Date\n", "2020-01-22 00:00:00+00:00 0\n", "2020-01-23 00:00:00+00:00 0\n", "2020-01-24 00:00:00+00:00 0\n", "2020-01-25 00:00:00+00:00 0\n", "2020-01-26 00:00:00+00:00 0\n", " ... 
\n", "2022-04-17 00:00:00+00:00 11662214\n", "2022-04-18 00:00:00+00:00 11662214\n", "2022-04-19 00:00:00+00:00 11736893\n", "2022-04-20 00:00:00+00:00 11736893\n", "2022-04-21 00:00:00+00:00 11736893\n", "Name: Cases, Length: 821, dtype: int64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_es.set_index('Date')['Cases']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us move on to plotting. Keeping the previous selection, that is, the date and the number of cases (to see the evolution over time), **we add *plot* and give the chart a title inside the parentheses of *plot***." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "df_es.set_index('Date')['Cases'].plot(title=\"Casos de Covid19 en España\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Datos de Italia" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ahora repetimos el mismo proceso que hemos hecho hasta ahora pero con otro país, pero centrándonos solo en la representación visual de los datos.En este caso se ha elegido Italia." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "url_it = 'https://api.covid19api.com/country/italy/status/confirmed/live'\n", "df_it = pd.read_json(url_it)\n", "df_it.set_index('Date')['Cases'].plot(title=\"Casos de Covid19 Ben Italia\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Datos de México" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "De nuevo, escogemos un tercer país de la API y volvemos a visualizarlo a través de *plot* de su *dataframe*. Se ha elegido México." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "url_mex = 'https://api.covid19api.com/country/mexico/status/confirmed/live'\n", "df_mex = pd.read_json(url_mex)\n", "df_mex.set_index('Date')['Cases'].plot(title=\"Casos de Covid19 en México\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Comparación" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Una vez que hemos visto diversos países por separados, es el momento de unir todos sus datos para poder compararlos y ver cómo han ido evolucionando sus casos confirmados de coronavirus a lo largo de un periodo temporal." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Para ello, **creamos una nueva variable por cada país que contenga solo los datos que hemos ido necesitando y trabajando anteriormente: fecha y número de casos confirmados**. " ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "casos_it = df_it.set_index('Date')['Cases']\n", "casos_es = df_es.set_index('Date')['Cases']\n", "casos_mex = df_mex.set_index('Date')['Cases']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Una vez definidas estas variables, empezamos a jugar con ellas y mostrándolas, esta vez haciendo uso de la concatenación de la librería de *Pandas*." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Cases Cases Cases\n", "Date \n", "2020-01-22 00:00:00+00:00 0 0 0\n", "2020-01-23 00:00:00+00:00 0 0 0\n", "2020-01-24 00:00:00+00:00 0 0 0\n", "2020-01-25 00:00:00+00:00 0 0 0\n", "2020-01-26 00:00:00+00:00 0 0 0\n", "... ... ... ...\n", "2022-04-17 00:00:00+00:00 15712088 11662214 5727668\n", "2022-04-18 00:00:00+00:00 15730676 11662214 5727668\n", "2022-04-19 00:00:00+00:00 15758002 11736893 5729270\n", "2022-04-20 00:00:00+00:00 15858442 11736893 5730560\n", "2022-04-21 00:00:00+00:00 15934437 11736893 5731635\n", "\n", "[821 rows x 3 columns]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([casos_it, casos_es, casos_mex],axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Para simplificar esta agrupación, creamos una nueva variable a partir de la concatenación anterior, a la que llamaremos 'vs' y la mostramos." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Cases Cases Cases\n", "Date \n", "2020-01-22 00:00:00+00:00 0 0 0\n", "2020-01-23 00:00:00+00:00 0 0 0\n", "2020-01-24 00:00:00+00:00 0 0 0\n", "2020-01-25 00:00:00+00:00 0 0 0\n", "2020-01-26 00:00:00+00:00 0 0 0\n", "... ... ... ...\n", "2022-04-17 00:00:00+00:00 15712088 11662214 5727668\n", "2022-04-18 00:00:00+00:00 15730676 11662214 5727668\n", "2022-04-19 00:00:00+00:00 15758002 11736893 5729270\n", "2022-04-20 00:00:00+00:00 15858442 11736893 5730560\n", "2022-04-21 00:00:00+00:00 15934437 11736893 5731635\n", "\n", "[821 rows x 3 columns]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vs = pd.concat([casos_it, casos_es, casos_mex],axis=1)\n", "vs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ahora, damos nombre a las columnas para organizar bien los datos y el gráfico. Una vez hecho esto, mostramos el gráfico con *plot* y elegimos el tipo de gráfico, que en este caso, será de área." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "vs.columns = ['Italia', 'España', 'Mexico']\n", "vs.plot(title=\"Italia vs España vs Mexico\",kind='area')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Expportamos nuestra nueva variable, que hemos limpiado para que contenga únicamente los datos necesarios, y la exportamos a un archivo separado por comas (.csv), con la función *to_csv*." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "vs.to_csv('esvsit.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Verificamos que se haya guardado ejecutando el comando bash *ls*." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "api-pandas-folium.ipynb pruebas-r.ipynb\t\t Shared_Resources\r\n", "esvsit.csv\t\t python-api-covid19-pandas.ipynb Untitled1.ipynb\r\n", "esvsit.png\t\t python-pruebas.ipynb\t\t Untitled.ipynb\r\n" ] } ], "source": [ "!ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Para guardar el gráfico como imagen PNG, tenemos que importar la librería *pyplot* de *matplotlib*, que ya se encuentra instalada. Generamos el gráfico de nuevo y con plt.savefig('nombre-grafico.png') se crea la imagen." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "vs.plot()\n", "plt.savefig('esvsit.png')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Volvemos a verificar que se ha exportado correctamente y que el gráfico existe como imagen PNG con el nombre que le hemos dado." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "api-pandas-folium.ipynb pruebas-r.ipynb\t\t Shared_Resources\r\n", "esvsit.csv\t\t python-api-covid19-pandas.ipynb Untitled1.ipynb\r\n", "esvsit.png\t\t python-pruebas.ipynb\t\t Untitled.ipynb\r\n" ] } ], "source": [ "!ls" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }