{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Análisis de datos sobre el Covid con 'pandas'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Tomamos el enlace de la base de datos que contiene información acerca de la incidencia del Covid. Esta api nos servirá para posteriormente visualizarse gracias a la importación de la librería 'pandas'. Así, la api podrá ser fácilmente entendible para quien tiene delante esos datos"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"La URL https://api.covid19api.com/countries"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Instalamos la librería 'pandas' porque el paquete no se encuentra dentro de las librerías de las que dispone Jupyterhub"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pandas in /usr/local/lib/python3.8/dist-packages (1.3.1)\n",
"Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (2020.4)\n",
"Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (2.8.1)\n",
"Requirement already satisfied: numpy>=1.17.3 in /usr/local/lib/python3.8/dist-packages (from pandas) (1.21.1)\n",
"Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7.3->pandas) (1.14.0)\n"
]
}
],
"source": [
" !pip install pandas\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primero importamos la librería para poder trabajar con ella y utilizar sus tablas de visualización de datos. A continuación, denominamos a pandas simplemente pd para que sea más fácil referenciar su nombre"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Asignamos un valor a la URL para que cuando escribamos \"url\", Python entienda que nos referimos exactamente al contenido de ese enlace que contiene la api"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"url = 'https://api.covid19api.com/countries'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Llamamos al URL que contiene el archivo json con la función de pandas read_json y después analizamos con nano qué delimitadores tiene este archuivo en concreto. Así se puede leer correctamente"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
" df = pd.read_json(url)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Comprobamos el contenido del dataframe y nos aseguramos de que podemos visualizar lo que previamente hemos indicado"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Country
\n",
"
Slug
\n",
"
ISO2
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
French Polynesia
\n",
"
french-polynesia
\n",
"
PF
\n",
"
\n",
"
\n",
"
1
\n",
"
Indonesia
\n",
"
indonesia
\n",
"
ID
\n",
"
\n",
"
\n",
"
2
\n",
"
Malta
\n",
"
malta
\n",
"
MT
\n",
"
\n",
"
\n",
"
3
\n",
"
Sao Tome and Principe
\n",
"
sao-tome-and-principe
\n",
"
ST
\n",
"
\n",
"
\n",
"
4
\n",
"
Haiti
\n",
"
haiti
\n",
"
HT
\n",
"
\n",
"
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
\n",
"
\n",
"
243
\n",
"
Myanmar
\n",
"
myanmar
\n",
"
MM
\n",
"
\n",
"
\n",
"
244
\n",
"
Turkey
\n",
"
turkey
\n",
"
TR
\n",
"
\n",
"
\n",
"
245
\n",
"
Western Sahara
\n",
"
western-sahara
\n",
"
EH
\n",
"
\n",
"
\n",
"
246
\n",
"
Pitcairn
\n",
"
pitcairn
\n",
"
PN
\n",
"
\n",
"
\n",
"
247
\n",
"
Tanzania, United Republic of
\n",
"
tanzania
\n",
"
TZ
\n",
"
\n",
" \n",
"
\n",
"
248 rows × 3 columns
\n",
"
"
],
"text/plain": [
" Country Slug ISO2\n",
"0 French Polynesia french-polynesia PF\n",
"1 Indonesia indonesia ID\n",
"2 Malta malta MT\n",
"3 Sao Tome and Principe sao-tome-and-principe ST\n",
"4 Haiti haiti HT\n",
".. ... ... ...\n",
"243 Myanmar myanmar MM\n",
"244 Turkey turkey TR\n",
"245 Western Sahara western-sahara EH\n",
"246 Pitcairn pitcairn PN\n",
"247 Tanzania, United Republic of tanzania TZ\n",
"\n",
"[248 rows x 3 columns]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Mediante las funciones head y tail podemos visualizar la parte superior e inferior de los datos de la tabla respectivamente"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Country
\n",
"
Slug
\n",
"
ISO2
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
French Polynesia
\n",
"
french-polynesia
\n",
"
PF
\n",
"
\n",
"
\n",
"
1
\n",
"
Indonesia
\n",
"
indonesia
\n",
"
ID
\n",
"
\n",
"
\n",
"
2
\n",
"
Malta
\n",
"
malta
\n",
"
MT
\n",
"
\n",
"
\n",
"
3
\n",
"
Sao Tome and Principe
\n",
"
sao-tome-and-principe
\n",
"
ST
\n",
"
\n",
"
\n",
"
4
\n",
"
Haiti
\n",
"
haiti
\n",
"
HT
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Country Slug ISO2\n",
"0 French Polynesia french-polynesia PF\n",
"1 Indonesia indonesia ID\n",
"2 Malta malta MT\n",
"3 Sao Tome and Principe sao-tome-and-principe ST\n",
"4 Haiti haiti HT"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Country
\n",
"
Slug
\n",
"
ISO2
\n",
"
\n",
" \n",
" \n",
"
\n",
"
243
\n",
"
Myanmar
\n",
"
myanmar
\n",
"
MM
\n",
"
\n",
"
\n",
"
244
\n",
"
Turkey
\n",
"
turkey
\n",
"
TR
\n",
"
\n",
"
\n",
"
245
\n",
"
Western Sahara
\n",
"
western-sahara
\n",
"
EH
\n",
"
\n",
"
\n",
"
246
\n",
"
Pitcairn
\n",
"
pitcairn
\n",
"
PN
\n",
"
\n",
"
\n",
"
247
\n",
"
Tanzania, United Republic of
\n",
"
tanzania
\n",
"
TZ
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Country Slug ISO2\n",
"243 Myanmar myanmar MM\n",
"244 Turkey turkey TR\n",
"245 Western Sahara western-sahara EH\n",
"246 Pitcairn pitcairn PN\n",
"247 Tanzania, United Republic of tanzania TZ"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.tail ()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"ename": "SyntaxError",
"evalue": "invalid character in identifier (, line 1)",
"output_type": "error",
"traceback": [
"\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m df[‘country’][66]\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid character in identifier\n"
]
}
],
"source": [
"df['Country'][66]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Ahora solamente buscamos los datos relativos a España, por lo que delimitamos dicha búsqueda: Reasignamos un valor a la url y marcamos que la dataframe que necesitamos tomar es la de España"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Country
\n",
"
CountryCode
\n",
"
Province
\n",
"
City
\n",
"
CityCode
\n",
"
Lat
\n",
"
Lon
\n",
"
Cases
\n",
"
Status
\n",
"
Date
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
0
\n",
"
confirmed
\n",
"
2020-01-22 00:00:00+00:00
\n",
"
\n",
"
\n",
"
1
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
0
\n",
"
confirmed
\n",
"
2020-01-23 00:00:00+00:00
\n",
"
\n",
"
\n",
"
2
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
0
\n",
"
confirmed
\n",
"
2020-01-24 00:00:00+00:00
\n",
"
\n",
"
\n",
"
3
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
0
\n",
"
confirmed
\n",
"
2020-01-25 00:00:00+00:00
\n",
"
\n",
"
\n",
"
4
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
0
\n",
"
confirmed
\n",
"
2020-01-26 00:00:00+00:00
\n",
"
\n",
"
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
\n",
"
\n",
"
798
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
11508309
\n",
"
confirmed
\n",
"
2022-03-30 00:00:00+00:00
\n",
"
\n",
"
\n",
"
799
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
11508309
\n",
"
confirmed
\n",
"
2022-03-31 00:00:00+00:00
\n",
"
\n",
"
\n",
"
800
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
11551574
\n",
"
confirmed
\n",
"
2022-04-01 00:00:00+00:00
\n",
"
\n",
"
\n",
"
801
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
11551574
\n",
"
confirmed
\n",
"
2022-04-02 00:00:00+00:00
\n",
"
\n",
"
\n",
"
802
\n",
"
Spain
\n",
"
ES
\n",
"
\n",
"
\n",
"
\n",
"
40.46
\n",
"
-3.75
\n",
"
11551574
\n",
"
confirmed
\n",
"
2022-04-03 00:00:00+00:00
\n",
"
\n",
" \n",
"
\n",
"
803 rows × 10 columns
\n",
"
"
],
"text/plain": [
" Country CountryCode Province City CityCode Lat Lon Cases \\\n",
"0 Spain ES 40.46 -3.75 0 \n",
"1 Spain ES 40.46 -3.75 0 \n",
"2 Spain ES 40.46 -3.75 0 \n",
"3 Spain ES 40.46 -3.75 0 \n",
"4 Spain ES 40.46 -3.75 0 \n",
".. ... ... ... ... ... ... ... ... \n",
"798 Spain ES 40.46 -3.75 11508309 \n",
"799 Spain ES 40.46 -3.75 11508309 \n",
"800 Spain ES 40.46 -3.75 11551574 \n",
"801 Spain ES 40.46 -3.75 11551574 \n",
"802 Spain ES 40.46 -3.75 11551574 \n",
"\n",
" Status Date \n",
"0 confirmed 2020-01-22 00:00:00+00:00 \n",
"1 confirmed 2020-01-23 00:00:00+00:00 \n",
"2 confirmed 2020-01-24 00:00:00+00:00 \n",
"3 confirmed 2020-01-25 00:00:00+00:00 \n",
"4 confirmed 2020-01-26 00:00:00+00:00 \n",
".. ... ... \n",
"798 confirmed 2022-03-30 00:00:00+00:00 \n",
"799 confirmed 2022-03-31 00:00:00+00:00 \n",
"800 confirmed 2022-04-01 00:00:00+00:00 \n",
"801 confirmed 2022-04-02 00:00:00+00:00 \n",
"802 confirmed 2022-04-03 00:00:00+00:00 \n",
"\n",
"[803 rows x 10 columns]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"url_es='https://api.covid19api.com/country/spain/status/confirmed/live'\n",
"df_es= pd.read_json(url_es)\n",
"df_es\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Obtenemos el nombre de cada columna mediante la función 'columns'"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['Country', 'CountryCode', 'Province', 'City', 'CityCode', 'Lat', 'Lon',\n",
" 'Cases', 'Status', 'Date'],\n",
" dtype='object')"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_es.columns\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Con la función info obtenemos un resumen del dataframe, como en este caso el número total de columnas (vemos que son un total de 10) o la memoria usada"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"RangeIndex: 803 entries, 0 to 802\n",
"Data columns (total 10 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 Country 803 non-null object \n",
" 1 CountryCode 803 non-null object \n",
" 2 Province 803 non-null object \n",
" 3 City 803 non-null object \n",
" 4 CityCode 803 non-null object \n",
" 5 Lat 803 non-null float64 \n",
" 6 Lon 803 non-null float64 \n",
" 7 Cases 803 non-null int64 \n",
" 8 Status 803 non-null object \n",
" 9 Date 803 non-null datetime64[ns, UTC]\n",
"dtypes: datetime64[ns, UTC](1), float64(2), int64(1), object(6)\n",
"memory usage: 62.9+ KB\n"
]
}
],
"source": [
"df_es.info()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Para obtener información sobre valores numéricos de una columna utilizamos la función describe"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"