{ "cells": [ { "cell_type": "markdown", "id": "3f90ea29", "metadata": {}, "source": [ "# Sorting and ranking\n", "\n", "Sorting a dataset by some criterion is another important built-in feature. To sort lexicographically by row or column index, use the methods [pandas.Series.sort_index](https://pandas.pydata.org/docs/reference/api/pandas.Series.sort_index.html) or [pandas.DataFrame.sort_index](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_index.html), each of which returns a new, sorted object. Passing `ascending=False` reverses the sort order:" ] }, { "cell_type": "code", "execution_count": 1, "id": "494c8d07", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.344054Z", "iopub.status.busy": "2026-05-21T13:56:04.343220Z", "iopub.status.idle": "2026-05-21T13:56:04.574440Z", "shell.execute_reply": "2026-05-21T13:56:04.574036Z", "shell.execute_reply.started": "2026-05-21T13:56:04.344028Z" } }, "outputs": [ { "data": { "text/plain": [ "6 0.849312\n", "5 -0.446785\n", "4 -1.180987\n", "3 1.739605\n", "2 1.535317\n", "1 -0.352107\n", "0 0.755917\n", "dtype: float64" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "\n", "rng = np.random.default_rng()\n", "s = pd.Series(rng.normal(size=7))\n", "\n", "s.sort_index(ascending=False)" ] }, { "cell_type": "markdown", "id": "cc386834", "metadata": {}, "source": [ "To sort a Series by its values, you can use the `sort_values` method:" ] }, { "cell_type": "code", "execution_count": 2, "id": "fadf1cd3", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.575215Z", "iopub.status.busy": "2026-05-21T13:56:04.575086Z", "iopub.status.idle": "2026-05-21T13:56:04.578186Z", "shell.execute_reply": "2026-05-21T13:56:04.577829Z", "shell.execute_reply.started": "2026-05-21T13:56:04.575206Z" } }, 
"outputs": [ { "data": { "text/plain": [ "4 -1.180987\n", "5 -0.446785\n", "1 -0.352107\n", "0 0.755917\n", "6 0.849312\n", "2 1.535317\n", "3 1.739605\n", "dtype: float64" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s.sort_values()" ] }, { "cell_type": "markdown", "id": "70aa0dfe", "metadata": {}, "source": [ "By default, all missing values are sorted to the end of the Series:" ] }, { "cell_type": "code", "execution_count": 3, "id": "ac6aac28", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.578894Z", "iopub.status.busy": "2026-05-21T13:56:04.578798Z", "iopub.status.idle": "2026-05-21T13:56:04.582391Z", "shell.execute_reply": "2026-05-21T13:56:04.582109Z", "shell.execute_reply.started": "2026-05-21T13:56:04.578886Z" } }, "outputs": [ { "data": { "text/plain": [ "5 0.334075\n", "6 1.507886\n", "0 1.549615\n", "1 NaN\n", "2 NaN\n", "3 NaN\n", "4 NaN\n", "dtype: float64" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = pd.Series(rng.normal(size=7))\n", "s[s < 0] = np.nan\n", "\n", "s.sort_values()" ] }, { "cell_type": "markdown", "id": "375693b4", "metadata": {}, "source": [ "With a DataFrame, you can sort by index on either axis:" ] }, { "cell_type": "code", "execution_count": 4, "id": "7c8c1daf", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.583093Z", "iopub.status.busy": "2026-05-21T13:56:04.582878Z", "iopub.status.idle": "2026-05-21T13:56:04.586842Z", "shell.execute_reply": "2026-05-21T13:56:04.586553Z", "shell.execute_reply.started": "2026-05-21T13:56:04.583086Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2\n", "0 0.093003 -0.749562 -0.492838\n", "1 -0.628586 -0.402284 -2.079898\n", "2 0.166077 -0.879292 0.970397\n", "3 -1.281080 0.998237 0.099407\n", "4 -1.555077 2.524556 -0.291195\n", "5 -1.506027 0.278702 1.187874\n", "6 0.461046 -0.361845 1.671595" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.DataFrame(rng.normal(size=(7, 3)))\n", "\n", "df" ] }, { "cell_type": "code", "execution_count": 5, "id": "b6ce411c", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.587400Z", "iopub.status.busy": "2026-05-21T13:56:04.587309Z", "iopub.status.idle": "2026-05-21T13:56:04.590570Z", "shell.execute_reply": "2026-05-21T13:56:04.590249Z", "shell.execute_reply.started": "2026-05-21T13:56:04.587393Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2\n", "6 0.461046 -0.361845 1.671595\n", "5 -1.506027 0.278702 1.187874\n", "4 -1.555077 2.524556 -0.291195\n", "3 -1.281080 0.998237 0.099407\n", "2 0.166077 -0.879292 0.970397\n", "1 -0.628586 -0.402284 -2.079898\n", "0 0.093003 -0.749562 -0.492838" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sort_index(ascending=False)" ] }, { "cell_type": "code", "execution_count": 6, "id": "972152ec", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.592202Z", "iopub.status.busy": "2026-05-21T13:56:04.592081Z", "iopub.status.idle": "2026-05-21T13:56:04.595706Z", "shell.execute_reply": "2026-05-21T13:56:04.595436Z", "shell.execute_reply.started": "2026-05-21T13:56:04.592189Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 2 1 0\n", "0 -0.492838 -0.749562 0.093003\n", "1 -2.079898 -0.402284 -0.628586\n", "2 0.970397 -0.879292 0.166077\n", "3 0.099407 0.998237 -1.281080\n", "4 -0.291195 2.524556 -1.555077\n", "5 1.187874 0.278702 -1.506027\n", "6 1.671595 -0.361845 0.461046" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sort_index(axis=1, ascending=False)" ] }, { "cell_type": "markdown", "id": "64442bea", "metadata": {}, "source": [ "When sorting a DataFrame, you can use the data in one or more columns as sort keys. To do so, pass one or more column names to the `by` option of `sort_values`:" ] }, { "cell_type": "code", "execution_count": 7, "id": "a31b3904", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.596225Z", "iopub.status.busy": "2026-05-21T13:56:04.596090Z", "iopub.status.idle": "2026-05-21T13:56:04.599413Z", "shell.execute_reply": "2026-05-21T13:56:04.599169Z", "shell.execute_reply.started": "2026-05-21T13:56:04.596215Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2\n", "1 -0.628586 -0.402284 -2.079898\n", "0 0.093003 -0.749562 -0.492838\n", "4 -1.555077 2.524556 -0.291195\n", "3 -1.281080 0.998237 0.099407\n", "2 0.166077 -0.879292 0.970397\n", "5 -1.506027 0.278702 1.187874\n", "6 0.461046 -0.361845 1.671595" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sort_values(by=2)" ] }, { "cell_type": "markdown", "id": "853e1380", "metadata": {}, "source": [ "To sort by multiple columns, you can pass a list of names to `by`; rows are ordered by the first column, and later columns break ties." ] }, { "cell_type": "markdown", "id": "b7c41a18", "metadata": {}, "source": [ "Ranking assigns ranks from one up to the number of valid data points in an array:" ] }, { "cell_type": "code", "execution_count": 8, "id": "cfd396dd", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.599978Z", "iopub.status.busy": "2026-05-21T13:56:04.599896Z", "iopub.status.idle": "2026-05-21T13:56:04.603608Z", "shell.execute_reply": "2026-05-21T13:56:04.603320Z", "shell.execute_reply.started": "2026-05-21T13:56:04.599972Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2\n", "0 5.0 2.0 2.0\n", "1 4.0 3.0 1.0\n", "2 6.0 1.0 5.0\n", "3 3.0 6.0 4.0\n", "4 1.0 7.0 3.0\n", "5 2.0 5.0 6.0\n", "6 7.0 4.0 7.0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.rank()" ] }, { "cell_type": "markdown", "id": "6fe180d8", "metadata": {}, "source": [ "If ties occur in the ranking, `rank` by default assigns each tied group its mean rank. The `method` argument selects a different tie-breaking rule, for example `max`. Since the random values in `df` contain no ties, the output here is identical to that of the default method:" ] }, { "cell_type": "code", "execution_count": 9, "id": "0cec47c0", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T13:56:04.604017Z", "iopub.status.busy": "2026-05-21T13:56:04.603939Z", "iopub.status.idle": "2026-05-21T13:56:04.607463Z", "shell.execute_reply": "2026-05-21T13:56:04.607231Z", "shell.execute_reply.started": "2026-05-21T13:56:04.604010Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2\n", "0 5.0 2.0 2.0\n", "1 4.0 3.0 1.0\n", "2 6.0 1.0 5.0\n", "3 3.0 6.0 4.0\n", "4 1.0 7.0 3.0\n", "5 2.0 5.0 6.0\n", "6 7.0 4.0 7.0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.rank(method=\"max\")" ] }, { "cell_type": "markdown", "id": "63fd4e17", "metadata": {}, "source": [ "## Further methods for `rank`\n", "\n", "Method | Description\n", ":------ | :-----------\n", "`average` | default: assigns the average rank to each entry in the same group\n", "`min` | uses the minimum rank for the whole group\n", "`max` | uses the maximum rank for the whole group\n", "`first` | assigns ranks in the order in which the values appear in the data\n", "`dense` | like `method='min'`, but ranks always increase by 1 between groups rather than by the number of equal elements in a group" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.13 Kernel", "language": "python", "name": "python313" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }