BUG: `pd.Index.str.split` has an unexpected return type with `expand=True`

This issue was created on 2022-11-23.

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
pd.Index([f'{i}_{i}' for i in range(10)]).str.split('__', expand=True)

Issue Description

Returns a flat Index instead of a MultiIndex: Index(['0_0', '1_1', '2_2', '3_3', '4_4', '5_5', '6_6', '7_7', '8_8', '9_9'], dtype='object')

Expected Behavior

Should return a single-level multi-index per the docs with expand=True.
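
For comparison, here is a minimal sketch that runs the Series and Index accessors side by side with a separator that never matches. The variable names are illustrative, and the type printed for the Index case may depend on the pandas version (on 1.5.2 it is the flat Index described above).

```python
import pandas as pd

labels = [f"{i}_{i}" for i in range(3)]

# Series accessor: expand=True yields a DataFrame even when nothing is split.
series_result = pd.Series(labels).str.split("__", expand=True)
print(type(series_result).__name__, series_result.shape)

# Index accessor: the type printed here is what this report is about
# (a flat Index on pandas 1.5.2; other versions may differ).
index_result = pd.Index(labels).str.split("__", expand=True)
print(type(index_result).__name__, index_result.nlevels)
```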

Installed Versions

INSTALLED VERSIONS

commit : 8dab54d
python : 3.9.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.4.0-132-generic
Version : #148-Ubuntu SMP Mon Oct 17 16:02:06 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.2
numpy : 1.22.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 61.2.0
pip : 22.3.1
Cython : 0.29.30
pytest : 7.1.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.5.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : None
brotli :
fastparquet : None
fsspec : 2022.5.0
gcsfs : None
matplotlib : 3.5.2
numba : 0.56.0
numexpr : 2.8.3
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 6.0.1
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
snappy : None
sqlalchemy : 1.4.27
tables : 3.7.0
tabulate : 0.8.10
xarray : 2022.10.0
xlrd : 2.0.0
xlwt : None
zstandard : None
tzdata : None

MarcoGorelli wrote this answer on 2022-11-23

thanks for the report - to expedite resolution, could you put a descriptive title please?

erezinman wrote this answer on 2022-11-23

LOL, sorry. Right away.

MarcoGorelli wrote this answer on 2022-11-23

this is correct, the string doesn't contain __

if you'd used _ you'd have got a multiindex

In [7]: import pandas as pd
   ...: pd.Index([f'{i}_{i}' for i in range(10)]).str.split('_', expand=True)
Out[7]:
MultiIndex([('0', '0'),
            ('1', '1'),
            ('2', '2'),
            ('3', '3'),
            ('4', '4'),
            ('5', '5'),
            ('6', '6'),
            ('7', '7'),
            ('8', '8'),
            ('9', '9')],
           )

closing for now then, but thanks for the report

MarcoGorelli wrote this answer on 2022-11-23

> Should return a single-level multi-index

sorry, you're right, in the Series case it does indeed return a single-column DataFrame:

In [11]: import pandas as pd
    ...: pd.Series([f'{i}_{i}' for i in range(10)]).str.split('__', expand=True)
Out[11]:
     0
0  0_0
1  1_1
2  2_2
3  3_3
4  4_4
5  5_5
6  6_6
7  7_7
8  8_8
9  9_9

erezinman wrote this answer on 2022-11-23

Well, of course it doesn't contain that, but the return type should not depend on that fact.
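
Until the behavior changes, one possible workaround is to normalize the result on the caller's side. The sketch below is an assumption-laden illustration, not a pandas API: `split_to_multiindex` is a hypothetical helper that wraps a flat Index in a one-level MultiIndex, so the return type no longer depends on whether the separator occurs.

```python
import pandas as pd


def split_to_multiindex(index: pd.Index, sep: str) -> pd.MultiIndex:
    """Hypothetical helper: split index labels and always return a MultiIndex."""
    result = index.str.split(sep, expand=True)
    if isinstance(result, pd.MultiIndex):
        # At least one split happened (or pandas already returned a MultiIndex).
        return result
    # No split happened and pandas returned a flat Index:
    # wrap it as the single level of a MultiIndex.
    return pd.MultiIndex.from_arrays([result])


idx = pd.Index([f"{i}_{i}" for i in range(3)])
no_match = split_to_multiindex(idx, "__")  # separator absent: one level
match = split_to_multiindex(idx, "_")      # separator present: two levels
print(no_match.nlevels, match.nlevels)
```

With this wrapper, downstream code can rely on `nlevels` and tuple-valued labels in both cases.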

ramvikrams wrote this answer on 2022-11-29

I would like to take up this issue, but I have one question first: is the change to be made in pandas/core/strings/accessor.py?

More Details About Repo
Owner Name pandas-dev
Repo Name pandas
Full Name pandas-dev/pandas
Language Python
