BUG: Plotly scatter plots no longer produce color correctly, color alias for c in 1.5

This issue was created on 2022-11-16.

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

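import pandas as pd
import plotly.express as px

# Route df.plot through plotly instead of the default matplotlib backend
pd.options.plotting.backend = "plotly"

df = px.data.iris()

# Works: plotly express called directly
px.scatter(df, x="sepal_width", y="sepal_length", color="species")

# Doesn't work as expected in 1.5.x
df.plot.scatter(x="sepal_width", y="sepal_length", color="species")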

Issue Description

Plotly (and perhaps other backends) takes a color argument: a column name used to assign a distinct color to each category (or even a continuous color scale). This is very convenient compared to the c argument in the default pandas backend, which takes a list of colors.

In pandas 1.5, color and c in the backend code become the same, or rather color is an alias of c.

In addition, the code pops the color argument out of kwargs, so the color information (expected to be a column name) is no longer present when the backend is called. Changing the line to

color = kwargs.get('color')

would leave the argument in place, but then other plotting backends would receive color and presumably crash on an unexpected keyword, so they would have to use c instead.
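
The difference between the two approaches can be sketched in plain Python. This is a simplified illustration, not the actual pandas source: fake_backend, forward_with_pop and forward_with_get are hypothetical names, and the aliasing logic is an assumption based on the description above.

```python
def fake_backend(**kwargs):
    """Hypothetical stand-in for a plotting backend; returns what it received."""
    return kwargs


def forward_with_pop(**kwargs):
    # Assumed 1.5.x-style behavior: `color` is removed and re-sent as `c`.
    color = kwargs.pop("color", None)
    if color is not None:
        kwargs["c"] = color
    return fake_backend(**kwargs)


def forward_with_get(**kwargs):
    # Alternative from the issue: read `color` but leave it in kwargs.
    color = kwargs.get("color")
    if color is not None:
        kwargs["c"] = color
    return fake_backend(**kwargs)


popped = forward_with_pop(x="sepal_width", y="sepal_length", color="species")
kept = forward_with_get(x="sepal_width", y="sepal_length", color="species")

print("color" in popped)  # False: the backend never sees `color`
print("color" in kept)    # True: the backend receives both `color` and `c`
```

With pop, a backend that only understands color loses the column name. With get, a backend that does not accept color would receive a keyword it does not expect.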

I think it would be best not to alias color to c for scatter plots, and to let backends like plotly continue to work as they did prior to 1.5.x.

Expected Behavior

The color kwarg should behave as it did prior to 1.5.x and be passed through to the plotly backend as a column name, rather than being removed from kwargs and aliased to the c kwarg.
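
One way to check what a backend actually receives, without installing plotly, is to register a tiny stand-in backend that returns its arguments. This is only a sketch: kwarg_spy is a hypothetical module name, and it relies on my understanding that pandas accepts any importable module exposing a top-level plot function when it is named via the backend keyword.

```python
import sys
import types

import pandas as pd

# Hypothetical in-memory backend module (assumption: pandas only needs the
# module named by `backend` to be importable and to expose `plot`).
spy = types.ModuleType("kwarg_spy")


def plot(data, kind, **kwargs):
    """Return the arguments pandas forwarded instead of drawing anything."""
    return {"kind": kind, **kwargs}


spy.plot = plot
sys.modules["kwarg_spy"] = spy

df = pd.DataFrame({"w": [1.0, 2.0], "l": [3.0, 4.0], "species": ["a", "b"]})
received = df.plot.scatter(x="w", y="l", color="species", backend="kwarg_spy")

# Shows whether "species" arrived under `color` or under `c`.
print({k: v for k, v in received.items() if k in ("color", "c")})
```

The printed dictionary depends on the installed pandas version, which is the point: it shows whether the column name reaches the backend as color or has been moved to c.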

Installed Versions

/usr/local/opt/miniforge3/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning:

Setuptools is replacing distutils.

INSTALLED VERSIONS

commit : 91111fd
python : 3.10.6.final.0
python-bits : 64
OS : Darwin
OS-release : 22.1.0
Version : Darwin Kernel Version 22.1.0: Sun Oct 9 20:14:30 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T8103
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.1
numpy : 1.23.4
pytz : 2022.1
dateutil : 2.8.2
setuptools : 65.5.1
pip : 22.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.6.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
brotli :
fastparquet : None
fsspec : 2022.5.0
gcsfs : None
matplotlib : 3.5.3
numba : 0.56.3
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 9.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.0
snappy : None
sqlalchemy : 1.4.44
tables : None
tabulate : 0.8.10
xarray : None
xlrd : 2.0.1
xlwt : None
zstandard : None
tzdata : 2022.6

MarcoGorelli wrote this answer on 2022-11-16

That's a good point, thanks @astrowonk

It may be best to revert the PR that introduced the alias

nicolaskruchten wrote this answer on 2022-11-16

Thanks for reporting this @astrowonk and for looking into it @MarcoGorelli!

As the author of the plotly backend, I would definitely expect that the external API here be very stable... "removing" kwargs like this definitely breaks our backend, but so would adding special handling to previously-ignored kwargs. For example right now with the plotly backend, df.plot.scatter() accepts all of the kwargs that plotly.express.scatter() does, so any "upstream" modifications to the kwargs would break things: https://plotly.com/python-api-reference/generated/plotly.express.scatter.html

I had to add some code to adapt the kwarg lists for various df.plot methods here https://github.com/plotly/plotly.py/blob/master/packages/python/plotly/plotly/__init__.py#L99 (s, c, by etc) and I guess I could continue to do so but I sort of built this backend under the assumption that this stuff wouldn't change very much :) Maybe I misunderstood the contract between Pandas and plotting backends though!
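
As an illustration of the kind of adaptation described above, a backend entry point could filter out pandas-specific keyword arguments before forwarding the rest. The sketch below is hypothetical and is not the code in plotly's __init__.py; adapt_scatter_kwargs and the choice of which keys to drop are assumptions.

```python
# Keys that the pandas scatter signature supplies but that the hypothetical
# target function does not understand.
PANDAS_ONLY_KEYS = ("s", "c")


def adapt_scatter_kwargs(kwargs):
    """Return a copy of kwargs without the pandas-only keys (hypothetical helper)."""
    return {k: v for k, v in kwargs.items() if k not in PANDAS_ONLY_KEYS}


incoming = {
    "x": "sepal_width",
    "y": "sepal_length",
    "s": None,
    "c": None,
    "color": "species",
}
forwarded = adapt_scatter_kwargs(incoming)
print(forwarded)
# {'x': 'sepal_width', 'y': 'sepal_length', 'color': 'species'}
```

If pandas were to move color into c before the backend is called, a filter like this would discard the column name without any error, which is one plausible way the regression in this issue could surface.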

MarcoGorelli wrote this answer on 2022-11-16

Hey @nicolaskruchten

> Maybe I misunderstood the contract between Pandas and plotting backends though!

No no, this is totally my fault for having approved https://github.com/pandas-dev/pandas/pull/44856/files . In fact, I should've rejected the issue instead of encouraging a PR. I wasn't thinking about the plotly backend because I don't use it (I use plotly directly by importing plotly and plotly.express)

Going forwards, something I've had in the back of my mind is to suggest to the rest of the pandas dev team that we declare pandas.plotting as being in "maintenance mode" only (my personal preference would be to remove most of it completely, as I think plotly.express and seaborn do the same thing but better. There's no comparison...). I'll open an issue about that soon.

Either way, you shouldn't need to worry about breaking changes in pandas.plotting.

Regarding what to do about this specific issue, I'll look into solutions more carefully later this week

astrowonk wrote this answer on 2022-11-16

> Going forwards, something I've had in the back of my mind is to suggest to the rest of the pandas dev team that we declare pandas.plotting as being in "maintenance mode" only (my personal preference would be to remove most of it completely, as I think plotly.express and seaborn do the same thing but better. There's no comparison...). I'll open an issue about that soon.

@MarcoGorelli I would just say that I have made extensive use of the pandas plotting back end syntax (albeit lately with plotly and the backend). I think it's important to keep so existing code keeps working with existing backends.

Were it to vanish, so many scripts and modules I have written would break, so whatever happens, please keep the df.plot syntax working!

MarcoGorelli wrote this answer on 2022-11-16

Thanks @astrowonk , that's a useful perspective

I was thinking more about all the custom matplotlib code in pandas, as that's what causes the most issues - it could be greatly simplified, like with an entrypoint in seaborn similar to the one in plotly. I'll flesh that out in a separate issue anyway

rhshadrach wrote this answer on 2022-11-16

> declare pandas.plotting as being in "maintenance mode" only (my personal preference would be to remove most of it completely, as I think plotly.express and seaborn do the same thing but better.

I find it very useful in ad hoc data analysis to be able to do df['my_column'].plot.hist(bins=30) etc. I would be -1 on removal (wasn't entirely sure what "most of it completely" meant).

Edit:

> I was thinking more about all the custom matplotlib code in pandas, as that's what causes the most issues - it could be greatly simplified, like with an entrypoint in seaborn similar to the one in plotly. I'll flesh that out in a separate issue anyway

Ah, I should have read further before posting. +1 here.

nicolaskruchten wrote this answer on 2022-11-16

Yeah my feeling is that the existing functionality can never go away... Maintenance mode sounds good to me though!
