Question
How to use a glob pattern to read many CSVs into one Polars data frame with the pydrive2 fsspec file system?
If I have two .csv files stored locally, data/file_1.csv and data/file_2.csv, which both have the same schema, it is easy to read both of them with Polars into one concatenated data frame like so:
pl.read_csv('data/file_*.csv')
But if I am storing these same two files in Google Drive (not a GCS bucket) and using GDriveFileSystem from pydrive2.fs as my fsspec file system, I cannot find a way to make use of the glob pattern and have to read them in separately, e.g.
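import polars as pl
from pydrive2.fs import GDriveFileSystem

fs = GDriveFileSystem(ROOT_FOLDER_ID, client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
dfs = []
for i in range(1, 3):  # file_1.csv and file_2.csv
    with fs.open(f'{ROOT_FOLDER_ID}/data/file_{i}.csv', 'rb') as f:
        dfs.append(pl.read_csv(f))  # append the frame; += would add its columns instead
df = pl.concat(dfs)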
Not only does this mean I need to know and specify the number of files and their exact paths in advance, but the code also feels a lot less clean than before.
Is there any way I can still read these multiple files with a glob path but using the fsspec file system?
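One direction that might work, sketched here under assumptions rather than verified against Google Drive: fsspec's AbstractFileSystem defines a generic glob() method, and GDriveFileSystem is an fsspec file system, so the pattern could be expanded by the file system itself and each match read through fs.open(). The helper names matching_paths and read_csv_glob are hypothetical, introduced only for illustration.

```python
def matching_paths(fs, pattern):
    """Return the sorted paths that `pattern` matches on an fsspec-style file system."""
    # glob() is part of fsspec's AbstractFileSystem interface. That it behaves
    # as expected on GDriveFileSystem for this folder layout is an assumption.
    return sorted(fs.glob(pattern))


def read_csv_glob(fs, pattern):
    """Read every CSV matching `pattern` and concatenate them into one DataFrame."""
    import polars as pl  # imported here so matching_paths() is usable without polars

    frames = []
    for path in matching_paths(fs, pattern):
        # pl.read_csv accepts an open binary file object, as in the loop above.
        with fs.open(path, "rb") as f:
            frames.append(pl.read_csv(f))
    if not frames:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return pl.concat(frames)
```

With the fs object from above, this would be called as read_csv_glob(fs, f'{ROOT_FOLDER_ID}/data/file_*.csv'). It still opens the files one at a time, but the number of files and their exact names no longer need to be known in advance.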