{ "cells": [ { "cell_type": "markdown", "id": "b95ac834", "metadata": {}, "source": [ "# Selecting and filtering data\n", "\n", "Indexing a Series (`obj[...]`) works like indexing NumPy arrays, except that you can use the index values of the Series instead of only integers. Here are some examples:" ] }, { "cell_type": "code", "execution_count": 1, "id": "3482a24f", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.139473Z", "iopub.status.busy": "2026-05-21T14:01:26.139280Z", "iopub.status.idle": "2026-05-21T14:01:26.361441Z", "shell.execute_reply": "2026-05-21T14:01:26.361024Z", "shell.execute_reply.started": "2026-05-21T14:01:26.139454Z" } }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "id": "f4edf9bc", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.362078Z", "iopub.status.busy": "2026-05-21T14:01:26.361958Z", "iopub.status.idle": "2026-05-21T14:01:26.365121Z", "shell.execute_reply": "2026-05-21T14:01:26.364698Z", "shell.execute_reply.started": "2026-05-21T14:01:26.362069Z" } }, "outputs": [], "source": [ "idx = pd.date_range(\"2022-02-02\", periods=7)\n", "rng = np.random.default_rng()\n", "s = pd.Series(rng.normal(size=7), index=idx)" ] }, { "cell_type": "code", "execution_count": 3, "id": "237188da", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.365625Z", "iopub.status.busy": "2026-05-21T14:01:26.365547Z", "iopub.status.idle": "2026-05-21T14:01:26.370366Z", "shell.execute_reply": "2026-05-21T14:01:26.370041Z", "shell.execute_reply.started": "2026-05-21T14:01:26.365618Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-02 -1.143049\n", "2022-02-03 0.371882\n", "2022-02-04 -0.739300\n", "2022-02-05 0.216581\n", "2022-02-06 -0.153057\n", "2022-02-07 -1.024227\n", "2022-02-08 1.677115\n", "Freq: D, dtype: float64" ] }, "execution_count": 3, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 4, "id": "1b12fbaa", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.370761Z", "iopub.status.busy": "2026-05-21T14:01:26.370683Z", "iopub.status.idle": "2026-05-21T14:01:26.373447Z", "shell.execute_reply": "2026-05-21T14:01:26.373171Z", "shell.execute_reply.started": "2026-05-21T14:01:26.370755Z" } }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.3718820858366448)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[\"2022-02-03\"]" ] }, { "cell_type": "code", "execution_count": 5, "id": "7da4fd93", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.373972Z", "iopub.status.busy": "2026-05-21T14:01:26.373886Z", "iopub.status.idle": "2026-05-21T14:01:26.376142Z", "shell.execute_reply": "2026-05-21T14:01:26.375911Z", "shell.execute_reply.started": "2026-05-21T14:01:26.373965Z" } }, "outputs": [ { "data": { "text/plain": [ "np.float64(0.3718820858366448)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.iloc[1]" ] }, { "cell_type": "code", "execution_count": 6, "id": "9531e948", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.378243Z", "iopub.status.busy": "2026-05-21T14:01:26.378111Z", "iopub.status.idle": "2026-05-21T14:01:26.380682Z", "shell.execute_reply": "2026-05-21T14:01:26.380393Z", "shell.execute_reply.started": "2026-05-21T14:01:26.378235Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-04 -0.739300\n", "2022-02-05 0.216581\n", "Freq: D, dtype: float64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[2:4]" ] }, { "cell_type": "code", "execution_count": 7, "id": "c3b587c0", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.381189Z", "iopub.status.busy": "2026-05-21T14:01:26.381105Z", "iopub.status.idle": 
"2026-05-21T14:01:26.384609Z", "shell.execute_reply": "2026-05-21T14:01:26.384147Z", "shell.execute_reply.started": "2026-05-21T14:01:26.381182Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-04 -0.739300\n", "2022-02-03 0.371882\n", "2022-02-02 -1.143049\n", "dtype: float64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[[\"2022-02-04\", \"2022-02-03\", \"2022-02-02\"]]" ] }, { "cell_type": "code", "execution_count": 8, "id": "31df8ae7", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.385275Z", "iopub.status.busy": "2026-05-21T14:01:26.385188Z", "iopub.status.idle": "2026-05-21T14:01:26.387811Z", "shell.execute_reply": "2026-05-21T14:01:26.387463Z", "shell.execute_reply.started": "2026-05-21T14:01:26.385267Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-03 0.371882\n", "2022-02-05 0.216581\n", "Freq: 2D, dtype: float64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.iloc[[1, 3]]" ] }, { "cell_type": "code", "execution_count": 9, "id": "4d612a38-6996-4e8d-ab5d-7b125ae66099", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.388463Z", "iopub.status.busy": "2026-05-21T14:01:26.388251Z", "iopub.status.idle": "2026-05-21T14:01:26.390807Z", "shell.execute_reply": "2026-05-21T14:01:26.390553Z", "shell.execute_reply.started": "2026-05-21T14:01:26.388455Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-03 0.371882\n", "2022-02-05 0.216581\n", "2022-02-08 1.677115\n", "dtype: float64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[s > 0]" ] }, { "cell_type": "markdown", "id": "82b5ebef", "metadata": {}, "source": [ "Although you can select data by label this way, the preferred way to select index values is the `loc` operator:" ] }, { "cell_type": "code", "execution_count": 10, "id": "0e11f024", "metadata": { "execution": { 
"iopub.execute_input": "2026-05-21T14:01:26.391502Z", "iopub.status.busy": "2026-05-21T14:01:26.391235Z", "iopub.status.idle": "2026-05-21T14:01:26.394889Z", "shell.execute_reply": "2026-05-21T14:01:26.394658Z", "shell.execute_reply.started": "2026-05-21T14:01:26.391491Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-04 -0.739300\n", "2022-02-03 0.371882\n", "2022-02-02 -1.143049\n", "dtype: float64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.loc[[\"2022-02-04\", \"2022-02-03\", \"2022-02-02\"]]" ] }, { "cell_type": "markdown", "id": "7b3b6f16", "metadata": {}, "source": [ "The reason to prefer `loc` is that `[]` treats integers inconsistently. With regular `[]`-based indexing, integers are treated as labels if the index contains integers, so the behaviour depends on the data type of the index. In our example, the expression `s.loc[[3, 2, 1]]` fails because the index does not contain any integers:" ] }, { "cell_type": "code", "execution_count": 11, "id": "c7279faf", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:26.395384Z", "iopub.status.busy": "2026-05-21T14:01:26.395303Z", "iopub.status.idle": "2026-05-21T14:01:26.611082Z", "shell.execute_reply": "2026-05-21T14:01:26.610116Z", "shell.execute_reply.started": "2026-05-21T14:01:26.395377Z" } }, "outputs": [ { "ename": "KeyError", "evalue": "\"None of [Index([3, 2, 1], dtype='int64')] are in the [index]\"", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[11], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m 
\u001b[43ms\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mloc\u001b[49m\u001b[43m[\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m3\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m2\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m\u001b[43m]\u001b[49m\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexing.py:1191\u001b[0m, in \u001b[0;36m_LocationIndexer.__getitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 1189\u001b[0m maybe_callable \u001b[38;5;241m=\u001b[39m com\u001b[38;5;241m.\u001b[39mapply_if_callable(key, \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj)\n\u001b[1;32m 1190\u001b[0m maybe_callable \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_check_deprecated_callable_usage(key, maybe_callable)\n\u001b[0;32m-> 1191\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_getitem_axis\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmaybe_callable\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexing.py:1420\u001b[0m, in \u001b[0;36m_LocIndexer._getitem_axis\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 1417\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mhasattr\u001b[39m(key, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mndim\u001b[39m\u001b[38;5;124m\"\u001b[39m) \u001b[38;5;129;01mand\u001b[39;00m key\u001b[38;5;241m.\u001b[39mndim \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m1\u001b[39m:\n\u001b[1;32m 1418\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCannot index with 
multidimensional key\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[0;32m-> 1420\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_getitem_iterable\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1422\u001b[0m \u001b[38;5;66;03m# nested tuple slicing\u001b[39;00m\n\u001b[1;32m 1423\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_nested_tuple(key, labels):\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexing.py:1360\u001b[0m, in \u001b[0;36m_LocIndexer._getitem_iterable\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 1357\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_validate_key(key, axis)\n\u001b[1;32m 1359\u001b[0m \u001b[38;5;66;03m# A collection of keys\u001b[39;00m\n\u001b[0;32m-> 1360\u001b[0m keyarr, indexer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_get_listlike_indexer\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1361\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj\u001b[38;5;241m.\u001b[39m_reindex_with_indexers(\n\u001b[1;32m 1362\u001b[0m {axis: [keyarr, indexer]}, copy\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m, allow_dups\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[1;32m 1363\u001b[0m )\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexing.py:1558\u001b[0m, in \u001b[0;36m_LocIndexer._get_listlike_indexer\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 1555\u001b[0m ax 
\u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj\u001b[38;5;241m.\u001b[39m_get_axis(axis)\n\u001b[1;32m 1556\u001b[0m axis_name \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj\u001b[38;5;241m.\u001b[39m_get_axis_name(axis)\n\u001b[0;32m-> 1558\u001b[0m keyarr, indexer \u001b[38;5;241m=\u001b[39m \u001b[43max\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_get_indexer_strict\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis_name\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1560\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m keyarr, indexer\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexes/base.py:6200\u001b[0m, in \u001b[0;36mIndex._get_indexer_strict\u001b[0;34m(self, key, axis_name)\u001b[0m\n\u001b[1;32m 6197\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 6198\u001b[0m keyarr, indexer, new_indexer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_reindex_non_unique(keyarr)\n\u001b[0;32m-> 6200\u001b[0m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_raise_if_missing\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkeyarr\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mindexer\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis_name\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 6202\u001b[0m keyarr \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mtake(indexer)\n\u001b[1;32m 6203\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(key, Index):\n\u001b[1;32m 6204\u001b[0m \u001b[38;5;66;03m# GH 42790 - Preserve name from an Index\u001b[39;00m\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexes/base.py:6249\u001b[0m, in 
\u001b[0;36mIndex._raise_if_missing\u001b[0;34m(self, key, indexer, axis_name)\u001b[0m\n\u001b[1;32m 6247\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m nmissing:\n\u001b[1;32m 6248\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m nmissing \u001b[38;5;241m==\u001b[39m \u001b[38;5;28mlen\u001b[39m(indexer):\n\u001b[0;32m-> 6249\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mNone of [\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mkey\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m] are in the [\u001b[39m\u001b[38;5;132;01m{\u001b[39;00maxis_name\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m]\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 6251\u001b[0m not_found \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(ensure_index(key)[missing_mask\u001b[38;5;241m.\u001b[39mnonzero()[\u001b[38;5;241m0\u001b[39m]]\u001b[38;5;241m.\u001b[39munique())\n\u001b[1;32m 6252\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mnot_found\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m not in index\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mKeyError\u001b[0m: \"None of [Index([3, 2, 1], dtype='int64')] are in the [index]\"" ] } ], "source": [ "s.loc[[3, 2, 1]]" ] }, { "cell_type": "markdown", "id": "ca01bbcf", "metadata": {}, "source": [ "While the `loc` operator indexes exclusively by label, the `iloc` operator indexes exclusively by integer position:" ] }, { "cell_type": "code", "execution_count": 12, "id": "37e0ba74", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.857055Z", "iopub.status.busy": "2026-05-21T14:01:39.856495Z", "iopub.status.idle": "2026-05-21T14:01:39.864232Z", "shell.execute_reply": "2026-05-21T14:01:39.863631Z", "shell.execute_reply.started": "2026-05-21T14:01:39.857010Z" } }, 
"outputs": [ { "data": { "text/plain": [ "2022-02-05 0.216581\n", "2022-02-04 -0.739300\n", "2022-02-03 0.371882\n", "Freq: -1D, dtype: float64" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.iloc[[3, 2, 1]]" ] }, { "cell_type": "markdown", "id": "e5dbc895", "metadata": {}, "source": [ "You can also slice with labels, but this works differently from normal Python slicing because the endpoint is inclusive:" ] }, { "cell_type": "code", "execution_count": 13, "id": "2121259e", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.865895Z", "iopub.status.busy": "2026-05-21T14:01:39.865680Z", "iopub.status.idle": "2026-05-21T14:01:39.872653Z", "shell.execute_reply": "2026-05-21T14:01:39.872225Z", "shell.execute_reply.started": "2026-05-21T14:01:39.865879Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-03 0.371882\n", "2022-02-04 -0.739300\n", "Freq: D, dtype: float64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.loc[\"2022-02-03\":\"2022-02-04\"]" ] }, { "cell_type": "markdown", "id": "3ebe649f", "metadata": {}, "source": [ "Assigning values with these methods modifies the corresponding section of the Series:" ] }, { "cell_type": "code", "execution_count": 14, "id": "4af5a091", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.873419Z", "iopub.status.busy": "2026-05-21T14:01:39.873174Z", "iopub.status.idle": "2026-05-21T14:01:39.878836Z", "shell.execute_reply": "2026-05-21T14:01:39.878372Z", "shell.execute_reply.started": "2026-05-21T14:01:39.873402Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-02 -1.143049\n", "2022-02-03 0.000000\n", "2022-02-04 0.000000\n", "2022-02-05 0.216581\n", "2022-02-06 -0.153057\n", "2022-02-07 -1.024227\n", "2022-02-08 1.677115\n", "Freq: D, dtype: float64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ 
"s.loc[\"2022-02-03\":\"2022-02-04\"] = 0\n", "\n", "s" ] }, { "cell_type": "markdown", "id": "2cdf7f54", "metadata": {}, "source": [ "Indexing a DataFrame retrieves one or more columns, either with a single value or with a sequence:" ] }, { "cell_type": "code", "execution_count": 15, "id": "d2ac569b", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.880485Z", "iopub.status.busy": "2026-05-21T14:01:39.880302Z", "iopub.status.idle": "2026-05-21T14:01:39.887882Z", "shell.execute_reply": "2026-05-21T14:01:39.887554Z", "shell.execute_reply.started": "2026-05-21T14:01:39.880470Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0000</th><td>0</td><td>001</td><td>NUL</td></tr>\n", "<tr><th>U+0001</th><td>1</td><td>002</td><td>Ctrl-A</td></tr>\n", "<tr><th>U+0002</th><td>2</td><td>003</td><td>Ctrl-B</td></tr>\n", "<tr><th>U+0003</th><td>3</td><td>004</td><td>Ctrl-C</td></tr>\n", "<tr><th>U+0004</th><td>4</td><td>004</td><td>Ctrl-D</td></tr>\n", "<tr><th>U+0005</th><td>5</td><td>005</td><td>Ctrl-E</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Decimal Octal Key\n", "Code \n", "U+0000 0 001 NUL\n", "U+0001 1 002 Ctrl-A\n", "U+0002 2 003 Ctrl-B\n", "U+0003 3 004 Ctrl-C\n", "U+0004 4 004 Ctrl-D\n", "U+0005 5 005 Ctrl-E" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = {\n", " \"Code\": [\"U+0000\", \"U+0001\", \"U+0002\", \"U+0003\", \"U+0004\", \"U+0005\"],\n", " \"Decimal\": [0, 1, 2, 3, 4, 5],\n", " \"Octal\": [\"001\", \"002\", \"003\", \"004\", \"004\", \"005\"],\n", " \"Key\": [\"NUL\", \"Ctrl-A\", \"Ctrl-B\", \"Ctrl-C\", \"Ctrl-D\", \"Ctrl-E\"],\n", "}\n", "\n", "df = pd.DataFrame(data)\n", "df = pd.DataFrame(data, columns=[\"Decimal\", \"Octal\", \"Key\"], index=df[\"Code\"])\n", "\n", "df" ] }, { "cell_type": "code", "execution_count": 16, "id": "81c7bce1", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.888329Z", "iopub.status.busy": "2026-05-21T14:01:39.888242Z", "iopub.status.idle": "2026-05-21T14:01:39.890809Z", "shell.execute_reply": "2026-05-21T14:01:39.890571Z", "shell.execute_reply.started": "2026-05-21T14:01:39.888322Z" } }, "outputs": [ { "data": { "text/plain": [ "Code\n", "U+0000 NUL\n", "U+0001 Ctrl-A\n", "U+0002 Ctrl-B\n", "U+0003 Ctrl-C\n", "U+0004 Ctrl-D\n", "U+0005 Ctrl-E\n", "Name: Key, dtype: object" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"Key\"]" ] }, { "cell_type": "code", "execution_count": 17, "id": "923b8440", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.891293Z", "iopub.status.busy": "2026-05-21T14:01:39.891220Z", "iopub.status.idle": "2026-05-21T14:01:39.894361Z", "shell.execute_reply": "2026-05-21T14:01:39.894097Z", "shell.execute_reply.started": "2026-05-21T14:01:39.891287Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0000</th><td>0</td><td>NUL</td></tr>\n", "<tr><th>U+0001</th><td>1</td><td>Ctrl-A</td></tr>\n", "<tr><th>U+0002</th><td>2</td><td>Ctrl-B</td></tr>\n", "<tr><th>U+0003</th><td>3</td><td>Ctrl-C</td></tr>\n", "<tr><th>U+0004</th><td>4</td><td>Ctrl-D</td></tr>\n", "<tr><th>U+0005</th><td>5</td><td>Ctrl-E</td></tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Decimal Key\n", "Code \n", "U+0000 0 NUL\n", "U+0001 1 Ctrl-A\n", "U+0002 2 Ctrl-B\n", "U+0003 3 Ctrl-C\n", "U+0004 4 Ctrl-D\n", "U+0005 5 Ctrl-E" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[[\"Decimal\", \"Key\"]]" ] }, { "cell_type": "markdown", "id": "b90336c6", "metadata": {}, "source": [ "The row selection syntax `df[:2]` is provided as a convenience. Passing a single element or a list to the `[]` operator selects columns." ] }, { "cell_type": "code", "execution_count": 18, "id": "72f5a3d0", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.894833Z", "iopub.status.busy": "2026-05-21T14:01:39.894762Z", "iopub.status.idle": "2026-05-21T14:01:39.898445Z", "shell.execute_reply": "2026-05-21T14:01:39.898166Z", "shell.execute_reply.started": "2026-05-21T14:01:39.894826Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0000</th><td>0</td><td>001</td><td>NUL</td></tr>\n", "<tr><th>U+0001</th><td>1</td><td>002</td><td>Ctrl-A</td></tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Decimal Octal Key\n", "Code \n", "U+0000 0 001 NUL\n", "U+0001 1 002 Ctrl-A" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[:2]" ] }, { "cell_type": "markdown", "id": "95042d47", "metadata": {}, "source": [ "Another use case is indexing with a boolean Series, such as one produced by a scalar comparison:" ] }, { "cell_type": "code", "execution_count": 19, "id": "edda0660", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.898790Z", "iopub.status.busy": "2026-05-21T14:01:39.898722Z", "iopub.status.idle": "2026-05-21T14:01:39.901495Z", "shell.execute_reply": "2026-05-21T14:01:39.901292Z", "shell.execute_reply.started": "2026-05-21T14:01:39.898783Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Code\n", "U+0000 False\n", "U+0001 False\n", "U+0002 False\n", "U+0003 True\n", "U+0004 True\n", "U+0005 True\n", "Name: Decimal, dtype: bool" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[\"Decimal\"] > 2" ] }, { "cell_type": "code", "execution_count": 20, "id": "dc1ee2c4-58ec-4e35-adb4-37534c042fb4", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.902120Z", "iopub.status.busy": "2026-05-21T14:01:39.901974Z", "iopub.status.idle": "2026-05-21T14:01:39.905301Z", "shell.execute_reply": "2026-05-21T14:01:39.905096Z", "shell.execute_reply.started": "2026-05-21T14:01:39.902106Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0003</th><td>3</td><td>004</td><td>Ctrl-C</td></tr>\n", "<tr><th>U+0004</th><td>4</td><td>004</td><td>Ctrl-D</td></tr>\n", "<tr><th>U+0005</th><td>5</td><td>005</td><td>Ctrl-E</td></tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Decimal Octal Key\n", "Code \n", "U+0003 3 004 Ctrl-C\n", "U+0004 4 004 Ctrl-D\n", "U+0005 5 005 Ctrl-E" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df[\"Decimal\"] > 2]" ] }, { "cell_type": "markdown", "id": "ade4d89f-b10b-4929-9e8a-3ac3083b4a03", "metadata": {}, "source": [ "You can also combine these boolean Series with the bitwise operators `&` and `|`:" ] }, { "cell_type": "code", "execution_count": 21, "id": "eab8dd55-1b70-4f58-beef-6aedc27dfabf", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.906769Z", "iopub.status.busy": "2026-05-21T14:01:39.906672Z", "iopub.status.idle": "2026-05-21T14:01:39.910134Z", "shell.execute_reply": "2026-05-21T14:01:39.909827Z", "shell.execute_reply.started": "2026-05-21T14:01:39.906760Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0003</th><td>3</td><td>004</td><td>Ctrl-C</td></tr>\n", "<tr><th>U+0004</th><td>4</td><td>004</td><td>Ctrl-D</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Decimal Octal Key\n", "Code \n", "U+0003 3 004 Ctrl-C\n", "U+0004 4 004 Ctrl-D" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df[\"Decimal\"] > 2) & (df[\"Decimal\"] < 5)]" ] }, { "cell_type": "code", "execution_count": 22, "id": "6f0d97c4-b4b6-44fe-9303-17744ce8d41f", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.910646Z", "iopub.status.busy": "2026-05-21T14:01:39.910533Z", "iopub.status.idle": "2026-05-21T14:01:39.914373Z", "shell.execute_reply": "2026-05-21T14:01:39.913928Z", "shell.execute_reply.started": "2026-05-21T14:01:39.910635Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0000</th><td>0</td><td>001</td><td>NUL</td></tr>\n", "<tr><th>U+0001</th><td>1</td><td>002</td><td>Ctrl-A</td></tr>\n", "<tr><th>U+0002</th><td>2</td><td>003</td><td>Ctrl-B</td></tr>\n", "<tr><th>U+0005</th><td>5</td><td>005</td><td>Ctrl-E</td></tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Decimal Octal Key\n", "Code \n", "U+0000 0 001 NUL\n", "U+0001 1 002 Ctrl-A\n", "U+0002 2 003 Ctrl-B\n", "U+0005 5 005 Ctrl-E" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[(df[\"Decimal\"] < 3) | (df[\"Decimal\"] > 4)]" ] }, { "cell_type": "markdown", "id": "0edb24c2", "metadata": {}, "source": [ "Like Series, DataFrame has the special operators `loc` and `iloc` for label-based and integer-based indexing respectively. Since a DataFrame is two-dimensional, you can select a subset of the rows and columns with NumPy-like notation using either axis labels (`loc`) or integers (`iloc`)." ] }, { "cell_type": "code", "execution_count": 23, "id": "c8eb86f9", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.914726Z", "iopub.status.busy": "2026-05-21T14:01:39.914659Z", "iopub.status.idle": "2026-05-21T14:01:39.917751Z", "shell.execute_reply": "2026-05-21T14:01:39.917439Z", "shell.execute_reply.started": "2026-05-21T14:01:39.914719Z" } }, "outputs": [ { "data": { "text/plain": [ "Decimal 2\n", "Key Ctrl-B\n", "Name: U+0002, dtype: object" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"U+0002\", [\"Decimal\", \"Key\"]]" ] }, { "cell_type": "code", "execution_count": 24, "id": "cc257cbc", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.918214Z", "iopub.status.busy": "2026-05-21T14:01:39.918119Z", "iopub.status.idle": "2026-05-21T14:01:39.921132Z", "shell.execute_reply": "2026-05-21T14:01:39.920924Z", "shell.execute_reply.started": "2026-05-21T14:01:39.918207Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0002</th><td>003</td><td>Ctrl-B</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " Octal Key\n", "Code \n", "U+0002 003 Ctrl-B" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[[2], [1, 2]]" ] }, { "cell_type": "code", "execution_count": 25, "id": "04868a17", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.921436Z", "iopub.status.busy": "2026-05-21T14:01:39.921378Z", "iopub.status.idle": "2026-05-21T14:01:39.924371Z", "shell.execute_reply": "2026-05-21T14:01:39.924167Z", "shell.execute_reply.started": "2026-05-21T14:01:39.921430Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0000</th><td>001</td><td>NUL</td></tr>\n", "<tr><th>U+0001</th><td>002</td><td>Ctrl-A</td></tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Octal Key\n", "Code \n", "U+0000 001 NUL\n", "U+0001 002 Ctrl-A" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[[0, 1], [1, 2]]" ] }, { "cell_type": "markdown", "id": "06be621a", "metadata": {}, "source": [ "Both indexing operators work with slices in addition to single labels or lists of labels:" ] }, { "cell_type": "code", "execution_count": 26, "id": "abb034d7", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.924832Z", "iopub.status.busy": "2026-05-21T14:01:39.924689Z", "iopub.status.idle": "2026-05-21T14:01:39.927024Z", "shell.execute_reply": "2026-05-21T14:01:39.926835Z", "shell.execute_reply.started": "2026-05-21T14:01:39.924826Z" } }, "outputs": [ { "data": { "text/plain": [ "Code\n", "U+0000 NUL\n", "U+0001 Ctrl-A\n", "U+0002 Ctrl-B\n", "U+0003 Ctrl-C\n", "Name: Key, dtype: object" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[:\"U+0003\", \"Key\"]" ] }, { "cell_type": "code", "execution_count": 27, "id": "46ae3947", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:01:39.927440Z", "iopub.status.busy": "2026-05-21T14:01:39.927367Z", "iopub.status.idle": "2026-05-21T14:01:39.930544Z", "shell.execute_reply": "2026-05-21T14:01:39.930040Z", "shell.execute_reply.started": "2026-05-21T14:01:39.927434Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr><th></th><th>Decimal</th><th>Octal</th><th>Key</th></tr>\n", "<tr><th>Code</th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>U+0000</th><td>0</td><td>001</td><td>NUL</td></tr>\n", "<tr><th>U+0001</th><td>1</td><td>002</td><td>Ctrl-A</td></tr>\n", "<tr><th>U+0002</th><td>2</td><td>003</td><td>Ctrl-B</td></tr>\n", "</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Decimal Octal Key\n", "Code \n", "U+0000 0 001 NUL\n", "U+0001 1 002 Ctrl-A\n", "U+0002 2 003 Ctrl-B" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.iloc[:3, :3]" ] }, { "cell_type": "markdown", "id": "48efc21c", "metadata": {}, "source": [ "So there are many ways to select and rearrange the data contained in a pandas object. Below I give a short summary of most of these options for DataFrames:\n", "\n", "Type | Note\n", ":-- | :------\n", "`df[LABEL]` | selects a single column or a sequence of columns from the DataFrame\n", "`df.loc[LABEL]` | selects a single row or a subset of rows from the DataFrame by label\n", "`df.loc[:, LABEL]` | selects a single column or a subset of columns by label\n", "`df.loc[LABEL1, LABEL2]` | selects both rows and columns by label\n", "`df.iloc[INTEGER]` | selects a single row or a subset of rows from the DataFrame by integer position\n", "`df.iloc[:, INTEGER]` | selects a single column or a subset of columns by integer position\n", "`df.iloc[INTEGER1, INTEGER2]` | selects both rows and columns by integer position\n", "`df.at[LABEL1, LABEL2]` | selects a single value by row and column label\n", "`df.iat[INTEGER1, INTEGER2]` | selects a single value by row and column position (integers)\n", "`reindex(NEW_INDEX)` | selects rows or columns by label\n", "`get_value, set_value` | deprecated since version 0.21.0 and removed in pandas 1.0: use `.at[]` or `.iat[]` instead." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3.13 Kernel", "language": "python", "name": "python313" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }