Most of us have edited photos or collages in Adobe Photoshop, but not everyone is comfortable with its image editing tools. So what if you want to crop a collage, or add a little extra pop to your selfie, but can't find your way around those tools? How would it sound if you could get it right using just your voice?
Sounds amazing, doesn’t it? The creators of Adobe Photoshop are working on a digital assistant that understands the user's creative goals and workflow, much like Google Assistant or Apple's Siri, but one that creates art.
Given verbal instructions from the user, Photoshop’s digital assistant can carry out editing tasks such as flipping photos, adjusting exposure, and cropping.
Adobe recently released a video demonstrating how the digital assistant responds to voice commands. In the video, a user gives the software voice commands to do things such as flipping an already edited photo, cropping it, and posting it to Facebook.
This is only Adobe's first step toward publicly promoting a voice-controlled digital assistant. The concept is still in development, and many more advanced, cutting-edge features are yet to be built for this “interactive agent for photo editing,” as Adobe calls it in the video.
Apart from the digital assistant for photo editing, Adobe has also demonstrated technology that lets you edit recorded speech, either altering what a person said or creating a new sentence in their voice. It seems inevitable that it will come to be known as Photoshop for audio.
The tech, dubbed voice conversion (VoCo), presents the user with a text box containing the words spoken in the recorded audio clip. You can move words around, delete fragments, or type entirely new sentences.
When you type a new word, there is a short pause while the word is constructed, and then you can play back the new audio clip.
Voice conversion (VoCo) works by ingesting a large amount of voice data, about 20 minutes for now, though that requirement is expected to improve. It breaks the recording down into distinct sounds called phonemes and then tries to build a model of the speaker's voice, including stresses and quirks. Adobe has not yet given much more detail.
When you edit someone’s speech, VoCo either finds the word somewhere within those 20 minutes of audio or constructs it from raw phonemes. If you watch the clip at around 4:30, you can hear the phrase “three times” being constructed from scratch. If you listen closely, it sounds a bit synthetic but not awful.
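To make the "find the word, or else build it from phonemes" idea concrete, here is a toy Python sketch. It is not Adobe's implementation, which has not been published in detail. The word index, the timestamps, the phoneme table, and the function names are all made up for illustration, and the sketch only decides which strategy would apply to each typed word; it does not touch any audio.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical index: words heard in the speaker's recorded audio,
# mapped to made-up (start, end) timestamps in seconds.
RECORDED_WORDS: Dict[str, Tuple[float, float]] = {
    "three": (12.4, 12.7),
    "dogs": (30.1, 30.6),
}

# Hypothetical phoneme table for words that were never recorded.
PHONEMES: Dict[str, List[str]] = {
    "times": ["T", "AY", "M", "Z"],
}

Detail = Union[Tuple[float, float], List[str], None]


def plan_word(word: str) -> Tuple[str, Detail]:
    """Pick a strategy for one typed word."""
    key = word.lower()
    if key in RECORDED_WORDS:
        # The speaker already said this word: reuse that slice of audio.
        return ("reuse", RECORDED_WORDS[key])
    if key in PHONEMES:
        # Not recorded: fall back to assembling it from phonemes.
        return ("synthesize", PHONEMES[key])
    # Neither source knows the word.
    return ("unknown", None)


def plan_edit(sentence: str) -> List[Tuple[str, str, Detail]]:
    """Plan every word of an edited sentence, in order."""
    return [(word, *plan_word(word)) for word in sentence.split()]


if __name__ == "__main__":
    for word, strategy, detail in plan_edit("three times"):
        print(word, strategy, detail)
```

In this toy setup, "three" would be reused from the recording while "times" would have to be synthesized, which is the kind of case where a listener might notice a slightly artificial sound.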
Adobe demonstrated VoCo at Adobe MAX 2016, where the company usually showcases new tech that is a year or two away from commercialization.
If it makes it out of the prototype stage, it could become an addition to Adobe Audition, where you could edit voiceovers or podcasts and also create funny audio clips of celebrities to share on Reddit.