Recording Client
The recording client needs to work on all major desktop and mobile browsers, including Safari and Firefox. It needs to size and scale correctly on both mobile and desktop devices and, preferably, adhere to an accessibility standard like WCAG AA. Making a good device selector is especially hard. Codecs and file formats differ between browsers, so you’ll have to further process the MP4 and WebM files created by MediaRecorder. You need a warning for low or muted microphones to avoid silent recordings. You need to scale the bitrate with resolution to avoid being capped by browser defaults.
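As a rough illustration of scaling the bitrate with resolution, here is a minimal sketch. The helper name videoBitrateFor, the 0.1 bits-per-pixel factor and the clamp limits are assumptions chosen for the example, not values from any browser or specification.

```typescript
// Hypothetical helper: picks a video bitrate that grows with the pixel count.
// The bits-per-pixel factor and the clamp limits are illustrative assumptions.
function videoBitrateFor(
  width: number,
  height: number,
  frameRate = 30,
  bitsPerPixel = 0.1,
): number {
  const raw = width * height * frameRate * bitsPerPixel;
  const min = 250000; // floor so very small streams stay watchable
  const max = 20000000; // ceiling to keep uploads manageable
  return Math.round(Math.min(max, Math.max(min, raw)));
}

// Possible usage in a browser (MediaRecorder accepts videoBitsPerSecond in its options):
// const recorder = new MediaRecorder(stream, {
//   videoBitsPerSecond: videoBitrateFor(1280, 720),
// });
```

With these assumed defaults, a 1280x720 stream at 30 fps comes out at roughly 2.8 Mbps and a 1920x1080 stream at roughly 6.2 Mbps. The factor would need tuning per codec and content type.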
You have to navigate several browser bugs (such as huge frames after standby), react to browser changes (Chrome changed the behavior of the “exact” keyword, Chrome changed its permission dialogs several times, and new codecs have been rolled out) and plan for user behavior like connecting new devices, Bluetooth headsets disconnecting, or switching from a Wi-Fi connection to 5G.
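Because codec support differs between browsers and shifts as new codecs roll out, one common approach is to feature-detect from an ordered preference list instead of hard-coding a format per browser. The sketch below assumes a hypothetical pickMimeType helper and an illustrative candidate list. The support check is injected so the logic can be exercised outside a browser.

```typescript
// Ordered by preference. The list itself is an assumption and should be
// revisited as browsers add or drop codecs.
const PREFERRED_MIME_TYPES: readonly string[] = [
  "video/webm;codecs=vp9,opus",
  "video/webm;codecs=vp8,opus",
  "video/mp4;codecs=avc1.42E01E,mp4a.40.2",
  "video/webm",
  "video/mp4",
];

// Returns the first candidate the environment reports as supported,
// or undefined when none of them is.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = PREFERRED_MIME_TYPES,
): string | undefined {
  for (const candidate of candidates) {
    if (isSupported(candidate)) return candidate;
  }
  return undefined;
}

// Possible usage in a browser:
// const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
```

An undefined result is the cue to fall back to the browser default or to tell the user that recording is not supported.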
The recording client needs to be hosted somewhere close to your users and you need a build mechanism with rollback in case you break anything. Publishing your own npm library is an option.
Transfer Mechanism
You need a transfer mechanism between the recorder and your server-side infrastructure. You can use POST, but that bulk uploads everything at the end. Streaming through WebSockets is more elegant, but it comes with its own complexity. You need to keep a local buffer, a server-side buffer and a log of what was correctly sent. If the user disconnects, you need to connect them to the same leaf and restart the upload or, better yet, resume the upload. You need a hash checking mechanism to make sure you’ve received the same data that was sent.
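The server-side bookkeeping described above can be sketched as a small session object that verifies each chunk against its hash and reports which sequence number it expects next, so a reconnecting client knows where to resume. Everything here is hypothetical (the UploadSession class, the ChunkResult shape, the sequence-number protocol). The FNV-1a hash is used only to keep the example dependency-free. A real deployment would more likely use SHA-256.

```typescript
// FNV-1a (32-bit): a dependency-free stand-in for a real hash such as SHA-256.
function fnv1a(data: Uint8Array): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < data.length; i++) {
    hash ^= data[i];
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}

type ChunkResult =
  | { ok: true; nextSeq: number }
  | { ok: false; reason: "out-of-order" | "hash-mismatch"; nextSeq: number };

class UploadSession {
  // Server-side buffer of verified chunks, in order.
  private readonly chunks: Uint8Array[] = [];

  // Sequence number expected next; a reconnecting client resumes from here.
  get nextSeq(): number {
    return this.chunks.length;
  }

  receive(seq: number, data: Uint8Array, claimedHash: string): ChunkResult {
    if (seq < this.nextSeq) {
      // Already stored (for example, resent after a lost acknowledgement).
      return { ok: true, nextSeq: this.nextSeq };
    }
    if (seq > this.nextSeq) {
      return { ok: false, reason: "out-of-order", nextSeq: this.nextSeq };
    }
    if (fnv1a(data) !== claimedHash) {
      return { ok: false, reason: "hash-mismatch", nextSeq: this.nextSeq };
    }
    this.chunks.push(data);
    return { ok: true, nextSeq: this.nextSeq };
  }
}
```

The client keeps each chunk in its local buffer until the returned nextSeq moves past it, and after a reconnect it restarts sending from nextSeq instead of from zero.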
Post Processing
Fragmented MP4 and WebM files created by browsers do not always play correctly. Both MP4 and WebM files are missing important duration information. You need to transcode everything to the container, audio codec and video codec you need. Transcoding is CPU intensive if you do it yourself with ffmpeg, or expensive if you use a third-party transcoding service. You need code to decide whether or not it’s worth re-transcoding H.264 to H.264. If someone uploads a new video, you need to move the moov atom to the start of the file and make sure the duration info is there for fast start. A keyframe should be placed at the beginning so that the video does not start with a black image. You need extra logic if you want to apply a watermark, take mobile rotation into consideration, crop the video or cut it to length.
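One way to frame the "re-transcode or not" decision is a function that builds the ffmpeg argument list from the probed codecs. This is a minimal sketch: the ProbedStreams shape, the buildFfmpegArgs name and the rule "copy when the stream is already H.264 or AAC" are assumptions for illustration. Stream copy cannot apply filters or insert a keyframe, so the forceVideoReencode flag covers watermarking, cropping and similar cases.

```typescript
// Hypothetical result of probing the uploaded file (for example with ffprobe).
interface ProbedStreams {
  videoCodec: string; // e.g. "h264", "vp8", "vp9"
  audioCodec: string; // e.g. "aac", "opus"
}

// Builds ffmpeg arguments that produce an MP4 with the moov atom at the start.
// Streams already in the target codec are copied instead of re-encoded.
function buildFfmpegArgs(
  input: string,
  output: string,
  probed: ProbedStreams,
  forceVideoReencode = false,
): string[] {
  const copyVideo = probed.videoCodec === "h264" && !forceVideoReencode;
  const copyAudio = probed.audioCodec === "aac";
  return [
    "-i", input,
    "-c:v", copyVideo ? "copy" : "libx264",
    "-c:a", copyAudio ? "copy" : "aac",
    "-movflags", "+faststart", // relocate the moov atom for fast start
    output,
  ];
}
```

The resulting array can be passed to a process launcher such as Node’s child_process.spawn with "ffmpeg" as the command. Whether copying is actually safe for a given browser’s output still needs testing against real recordings.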
Push To Storage
You need to implement your own push-to-storage mechanism that pushes the recording file and other deliverables to long-term storage like Amazon S3 or Cloudflare R2. Any SDK you rely on, like the AWS S3 SDK, needs to be kept up to date. The transfer needs to be secure, the credentials need to be stored and managed securely, and their permissions need to be correctly scoped.
Retry Mechanism
You need a staggered retry mechanism that keeps attempting to push to storage in case your storage provider is down or you mistakenly revoke your IAM user’s S3 permissions.
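A staggered retry is often implemented as exponential backoff with a cap. The sketch below shows one such variant. The function names, the base delay of one second and the one-minute cap are assumptions, and push stands for whatever upload call you actually use.

```typescript
// Delay schedule: baseMs, 2x, 4x, ... capped at maxMs.
function backoffDelaysMs(attempts: number, baseMs = 1000, maxMs = 60000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(maxMs, baseMs * Math.pow(2, i)));
  }
  return delays;
}

// Calls push() once, then retries after each delay until it succeeds
// or the schedule is exhausted, in which case the last error is rethrown.
async function pushWithRetry<T>(
  push: () => Promise<T>,
  delays: number[] = backoffDelaysMs(5),
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => { setTimeout(resolve, ms); }),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await push();
    } catch (error) {
      lastError = error;
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}
```

For outages that last hours, such as revoked permissions, an in-process loop like this is usually not enough on its own. Failed pushes would also need to be persisted in a queue so they survive restarts.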
Monitoring
The above infrastructure needs to be monitored using a third-party monitoring service that not only pings your servers but also runs deeper application-level checks that verify application availability. You need an alert channel, a way to avoid alert fatigue, and people available to react to alerts.
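One simple way to reduce alert fatigue is to notify only after several consecutive failed checks, and only once per incident. The sketch below assumes a hypothetical AlertGate class and a threshold of three. Real monitoring services typically offer an equivalent setting.

```typescript
type AlertAction = "alert" | "recovered" | "none";

class AlertGate {
  private readonly threshold: number;
  private consecutiveFailures = 0;
  private alerting = false;

  constructor(threshold = 3) {
    this.threshold = threshold;
  }

  // Feed in each health check result. Returns "alert" once when failures
  // reach the threshold and "recovered" once when checks pass again.
  record(ok: boolean): AlertAction {
    if (ok) {
      this.consecutiveFailures = 0;
      if (this.alerting) {
        this.alerting = false;
        return "recovered";
      }
      return "none";
    }
    this.consecutiveFailures++;
    if (!this.alerting && this.consecutiveFailures >= this.threshold) {
      this.alerting = true;
      return "alert";
    }
    return "none";
  }
}
```

A single failed check then produces no notification, while a sustained outage produces exactly one alert and one recovery message.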
Maintenance
All the above infrastructure needs to be maintained. You need to keep your OS, database, Nginx, Node, npm and every other library up to date. You need internal documentation, build scripts and backups. You need firewalls, DNS entries and fail2ban.