Batch uploading videos to Vimeo
During the summer we held a conference (obviously online, due to the pandemic). We finally got around to editing all the long live stream sessions into digestible presentation videos. So now we have to upload 56 videos - each 1GB - to Vimeo. I'd rather not do that manually...
1½ years ago, I was tasked with the same problem for the Audio Mostly 2020 conference. This time it's the AI Music Creativity conference, so let's see what we can re-use.
Python and the API
Vimeo provides a RESTful API that allows you (or a script on your behalf) to interact with the service. There's even a Python module, to make the API usable directly from Python.
import vimeo

v = vimeo.VimeoClient(
    token=ACCESS_TOKEN,
    key=CLIENT_ID,
    secret=CLIENT_SECRET
)

video_uri = v.upload(
    'your-filename.mp4',
    data={'name': 'Video title', 'description': '...'}
)
Scripting for fun and ☕
The Python module is the basis for a little script I wrote back then, which reads a configuration file containing all the metadata (like filename, title, description, license, ...) and uploads the videos while I go and have a coffee.
The script reads one or more configuration files that contain
- general settings (like access tokens)
- common settings (like license, or visibility settings)
- specific settings (e.g. the title and description for each video)
Obtaining access credentials
Register a dummy App (https://developer.vimeo.com/api/guides/start#register-your-app) (pick a nice app name, like "iem-uploader"):
App name: iem-uploader
App description: Script for batch-uploading videos
Will people besides you be able to access your app?: No
[x] I agree that my application does not violate the Vimeo API License Agreement or the Vimeo Terms of Service
At the top of the page, you see a Client identifier.
Save that as we are going to need it later (<CLIENT_ID>).
Somewhere at the end of the page, there is a section named Client secrets.
Save the autogenerated secret (or create a new one), we will also need that (<CLIENT_SECRET>).
Then create a new Access Token (of the Authenticated (You) variety).
In the Scopes section, make sure to tick the following items:
- Public (always required)
- Private
- Upload (will only be available once you've selected the Private scope)
Once you've clicked on Generate, a new token is created for you.
Save it now, we are going to need it (<ACCESS_TOKEN>) and once you've left the page,
you will no longer be able to see its full value.
Now save all the credentials in the default configuration file (e.g. vimeo-uploader.conf) of our script:
[DEFAULT]
client_id = <CLIENT_ID>
client_secret = <CLIENT_SECRET>
access_token = <ACCESS_TOKEN>
Note that once you have created a Vimeo App, you can no longer delete it. The same goes for Client Secrets within your app. So go for the real thing directly, without playing around first 😉
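To illustrate how the credentials stored above could end up in the VimeoClient call from the first listing, here is a minimal sketch. The helper name load_credentials is made up for this example and is not taken from the actual uploader script; it only assumes the three keys shown in the configuration file above.

```python
import configparser


def load_credentials(conffile):
    """Return keyword arguments for vimeo.VimeoClient, read from an .ini file.

    Hypothetical helper for illustration; the real script may do this differently.
    """
    # interpolation=None: secrets may contain '%', which configparser
    # would otherwise try to interpret as an interpolation marker
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(conffile, encoding="utf-8"):
        raise FileNotFoundError(conffile)
    defaults = config.defaults()  # the contents of the [DEFAULT] section
    return {
        "token": defaults["access_token"],
        "key": defaults["client_id"],
        "secret": defaults["client_secret"],
    }
```

With such a helper, the client could be created with vimeo.VimeoClient(**load_credentials("vimeo-uploader.conf")).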
Configuring the upload
I opted to use the .ini format for the upload configuration,
simply because it (or rather: Python's configparser module) has default values and interpolation built in,
which allows me to define some settings
(e.g. the license; or the Vimeo App credentials we created above) only once.
It is also possible to load multiple config files (which will be merged),
so we can store the credentials in the default vimeo-uploader.conf,
and the settings for the actual videos in another file (e.g. papers.ini),
which can then be passed via the -c option to the script:
[DEFAULT]
vimeo.name = AM20
vimeo.description = Audio Mostly 2020

vimeo.license = by-nc-nd
vimeo.privacy.view = unlisted

## iem
vimeo.embed.color = #1b408f
vimeo.embed.logos.custom.active = True
vimeo.embed.logos.custom.url = https://i.vimeocdn.com/player/478869.png?mw=100&mh=100
vimeo.embed.logos.custom.link = https://iem.at/

vimeo.embed.logos.vimeo = False

[jingle.mp4]
vimeo.name = 2nd Conference on AI Music Creativity
vimeo.description = The video jingle for the AIMC 2021 conference
 © 2021 Alisa Kobzar
vimeo.license = by-nc-sa

[paper1.mp4]
vimeo.name = AIMC 2021 | A Generative Model for Creating Musical Rhythms with Deep Reinforcement Learning
vimeo.description = © Seyed Mojtaba Karbasi et al.

 as presented at the 'AIMC2021' conference - 18.-22. July 2021, Graz/Austria and online
Any key that starts with vimeo. is added to the data dictionary when calling vimeo.VimeoClient.upload.
It's possible to create nested dictionaries by using several . separators.
E.g. vimeo.embed.logo = #1b408f will add {'embed': {'logo': '#1b408f'}} to the dictionary.
Multiline values (such as the vimeo.description in the example) need to have an extra indent of one space
(so they can be distinguished from a key: value line).
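The merging of several configuration sources and the expansion of dotted keys into nested dictionaries can be sketched roughly as follows. This is not the code of the actual uploader script: the function name nested_settings is invented for this example, and the two read_string calls merely stand in for loading vimeo-uploader.conf and papers.ini.

```python
import configparser


def nested_settings(section, prefix="vimeo."):
    """Turn 'vimeo.a.b = x' style keys into {'a': {'b': 'x'}} (illustrative helper)."""
    data = {}
    for key, value in section.items():
        if not key.startswith(prefix):
            continue  # e.g. the credentials, which are not upload metadata
        *parents, leaf = key[len(prefix):].split(".")
        node = data
        for name in parents:
            # descend into (or create) the intermediate dictionaries
            node = node.setdefault(name, {})
        node[leaf] = value
    return data


config = configparser.ConfigParser(interpolation=None)
# stand-ins for the two config files; sources read later are merged into earlier ones
config.read_string("[DEFAULT]\naccess_token = <ACCESS_TOKEN>\n")
config.read_string("""
[DEFAULT]
vimeo.license = by-nc-nd
vimeo.embed.color = #1b408f
vimeo.embed.logos.vimeo = False

[jingle.mp4]
vimeo.name = 2nd Conference on AI Music Creativity
vimeo.license = by-nc-sa
""")

# a section sees its own keys plus everything from [DEFAULT]
data = nested_settings(config["jingle.mp4"])
print(data)
```

Note that configparser returns all values as strings (so False arrives as 'False'); whether and how the real script converts them is not covered by this sketch.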
The input data
The input data I received was a folder full of video files, and two .xlsx-Sheets containing tables like
| File | Title | Authors | Performers |
|---|---|---|---|
| "concert1/piece 1.mp4" | Op.1 | Somebody or other | AIMC Orchestra |
These were saved as CSV in LibreOffice (actually as tab-separated-values),
so we can easily convert them to .ini files of our liking:
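import csv

def csv2ini(csvfile, outfile):
    """Convert a tab-separated export of the spreadsheet into an .ini file."""
    with open(outfile, "w", encoding="utf-8") as ofile:
        with open(csvfile, newline="", encoding="utf-8") as ifile:
            for data in csv.DictReader(ifile, dialect="excel-tab"):
                # the section name is the filename (without the stray double-quotes)
                ofile.write("[%s]\n" % data["File"].strip('"'))
                ofile.write("vimeo.name = AIMC 2021 | %s\n" % data["Title"])
                descr = []
                descr.append("© %s" % data["Authors"])
                descr.append("")
                descr.append("as presented at the 'AIMC2021' conference - 18.-22. July 2021, Graz/Austria and online")
                # only the concert spreadsheet has a (possibly empty) Performers column
                if data.get("Performers"):
                    descr.append("")
                    descr.append("performance by %s" % data["Performers"])
                # continuation lines of a multiline value are indented by one space
                ofile.write("vimeo.description = %s\n" % "\n ".join(descr))
                ofile.write("\n\n")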
Entries in the File column had some extra double-quotes, which we just strip away.
The Spreadsheet for the concerts contained an extra column Performers (the use of which was optional),
so we add some extra description if this field is present.
Cleaning up the input data
So after we've converted the two spreadsheet files to .ini-files, everything should be working, right?
Of course not.
The joys of unicode
My colleague, who prepared the spreadsheets, works on macOS.
And macOS prefers to use fully decomposed glyphs (NFD (Normalization Form Canonical Decomposition))
for complicated glyphs.
E.g. a glyph like Å (the ångström sign) will be represented as two characters:
the base character A ("upper-case Latin Letter 'A'") followed by a combining diacritic ° ("ring above").
In Unicode this is spelt out as U+0041 U+030A.
The files themselves were transmitted via our institute's NextCloud service.
In this process, the filenames were normalized to fully composed glyphs
(NFC (Normalization Form Canonical Composition)),
where glyphs are represented by the fewest possible number of characters.
E.g. the glyph Å (an upper-case letter "A" with a "ring above") can be encoded as a single character, either as
Å (the ångström sign, in Unicode U+212B)
or as Å (the Swedish letter, in Unicode U+00C5);
NFC always picks the latter.
The two representations are equivalent by definition, but that doesn't mean that they are the same.
Of course there was a file containing the name Jérôme in my dataset
(with two letters that are represented differently in NFD and NFC).
As the INI-file uses filenames as section names, both the configuration and the filesystem must use the same normalization form.
I prefer NFC, though it doesn't really matter.
The problem is that both forms are rendered identically, which makes the issue hard to spot.
In any case, I renamed the section.
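The mismatch is easy to reproduce with Python's unicodedata module. The following is only a small illustration of the problem described above (the helper same_name is made up for this example), not part of the uploader script:

```python
import unicodedata

name_nfc = "J\u00e9r\u00f4me"                      # é and ô as single code points
name_nfd = unicodedata.normalize("NFD", name_nfc)  # e + U+0301, o + U+0302

print(name_nfc == name_nfd)          # False, although both render as "Jérôme"
print(len(name_nfc), len(name_nfd))  # 6 8


def same_name(a, b):
    """Compare two names irrespective of their normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(same_name(name_nfc, name_nfd))  # True
```

Normalizing both the section names and the filenames to one form before comparing them would have avoided the manual renaming.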
Title lengths
There's a secret competition for the longest title of an academic paper. This unfortunately clashes with Vimeo's restriction that video titles must be at most 128 characters long.
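Overlong titles can be spotted before uploading by checking the generated .ini files. A minimal sketch, assuming the vimeo.name key from the configuration above and the 128 character limit just mentioned; the function name and the sample sections are invented for this example:

```python
import configparser

MAX_TITLE_LENGTH = 128  # Vimeo's limit for video titles, as stated above


def overlong_titles(config, limit=MAX_TITLE_LENGTH):
    """Yield (section, title length) for every title exceeding the limit."""
    for section in config.sections():
        title = config[section].get("vimeo.name", "")
        if len(title) > limit:
            yield section, len(title)


config = configparser.ConfigParser(interpolation=None)
# made-up sample data; in practice this would be config.read("papers.ini")
config.read_string(
    "[short.mp4]\nvimeo.name = AIMC 2021 | A short title\n"
    "[long.mp4]\nvimeo.name = AIMC 2021 | " + "very " * 30 + "long title\n"
)
for section, length in overlong_titles(config):
    print("%s: title has %d characters" % (section, length))
```

The flagged titles still have to be shortened by hand, but at least the upload does not fail halfway through the batch.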
Uploading the videos
find concerts/ -type f -exec ~/src/streaming/vimeo-uploader/vimeo-uploader.py -c concerts.ini --csv-file done-concerts.csv {} +
find papers/ -type f -exec ~/src/streaming/vimeo-uploader/vimeo-uploader.py -c papers.ini --csv-file done-papers.csv {} +
Post-upload cleanup
The original idea was that each upload got a title AIMC 2021 | Bla bla bla.
Unfortunately, Vimeo simply ignored the pipe sign | (between the header and the actual title).
Rather than finding out what went wrong, I just opened all the videos in my browser
(the upload script creates a nice CSV-file with all the URLs) and edited the titles manually
with good olde Ctrl-C, Ctrl-V editing.