Midjourney is branching out from AI image creation and editing.
David Holz, founder of the popular AI image generation startup that reportedly has more than 21 million users on its Discord server alone, jumped on a livestreaming audio “Space” on the social network X a few moments ago to test whether it supported screensharing.
Holz was joined by X owner Elon Musk, who revealed that a native X Spaces screensharing feature would launch next week, and until then, users would have to stream their screen through another app such as Discord.
Holz went on to explain he was preparing to show off a new multiplayer collaborative worldbuilding tool from Midjourney later today, and that the service would soon launch for people to try out.
Patchwork revealed
The Midjourney account on X also posted: “We’re starting our weekly Midjourney Office hours now with a overview of the week and then we’re gonna hand it off to Max to show off our latest experimental worldbuilding tool called ‘Patchwork’”
Holz then hosted another X Space, streamed through Restream and Discord, in which he invited his colleague Max Kreminski, who leads Midjourney’s Storytelling Lab, to demo Patchwork. He clarified that it would be a standalone app that requires a Midjourney account to log in.
The tool appears to be a web-based, infinite blank white canvas with a “toolbox” on the left side of the browser screen. The toolbox shows buttons labeled “character,” “event,” “faction,” “place,” “prop” and “random,” along with further tools such as “note,” “image,” “portal,” “save” and “share.” “Save” downloads a JSON file with links to all the Midjourney images created in the canvas.
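For readers who want to pull the image links back out of such a saved file, here is a minimal Python sketch. The layout of Patchwork's JSON was not shown in the demo, so the sketch assumes no particular schema: it walks the whole structure and collects any string that looks like a web link. The sample data, its field names and the `collect_links` helper are all hypothetical and only stand in for a real saved board.

```python
import json


def collect_links(node):
    """Recursively gather every string value that looks like a web link."""
    if isinstance(node, str):
        return [node] if node.startswith(("http://", "https://")) else []
    if isinstance(node, dict):
        return [link for value in node.values() for link in collect_links(value)]
    if isinstance(node, list):
        return [link for item in node for link in collect_links(item)]
    return []  # numbers, booleans and null carry no links


# Hypothetical sample; the real file's structure and field names are unknown.
sample_text = """
{
  "scraps": [
    {"kind": "character", "images": ["https://example.com/a.png",
                                     "https://example.com/b.png"]},
    {"kind": "note", "text": "no link here"},
    {"kind": "place", "image": "https://example.com/c.png"}
  ]
}
"""

links = collect_links(json.loads(sample_text))
for link in links:
    print(link)
```

Because the helper ignores field names entirely, it should keep working if the real file nests its links differently, at the cost of also picking up any non-image links the file might contain.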
To switch between worlds, the user creates a “portal,” a small black circular button.
To generate a new world, the user enters a text prompt into an editor bar at the top of the create screen and selects one or more of a set of 10 different image styles.
This produces a new whiteboard populated with still image assets and text boxes, entities known as “scraps.” These include input boxes that let the user prompt new images or settings that fit the initial world description, and even entirely new AI-generated character descriptions.
In the demo livestream, the character name automatically populated as Marcus “Dizzy” Gillespie, echoing the name of a famous real-life jazz musician. Dragging the description into a new character image creator box produces four new AI-generated images of the character.
After adding new character boxes, the user can prompt them to generate names and characteristics, as well as motivations that can spark a conflict to serve as the basis of a story.
The user can then link characters together with lines that denote connections between them. They can also write action sequences and scene descriptions that each narrate a story. Each character can also be used in multiple images, and those images can be gathered together with a single option.
The user can also “share” the board with others who have Midjourney logins, who can then begin collaborating on it as well, apparently in real time, with multiple cursors moving across the same shared canvas. Kreminski said only logged-in users can view boards for now, but that in the future, boards may be viewable by non-users. He mentioned that tabletop roleplaying groups were already using it to chart their campaigns.
He also noted that Midjourney version 7 (V7) would include a setting that keeps multiple characters consistent across different, new images.
Kreminski further revealed that at least three different large language models power the application, including a fine-tuned open-source one unique to Midjourney.
Ultimately, it appears to be a novel, complex, powerful, somewhat overwhelming yet compelling tool for storyboarding. I could easily see it being used by writers and film directors, game designers, comic book creators, and even live theater directors and writers.
Long term, Kreminski said there was a “very clear path in terms of escalation of the details and interactions in the worlds,” including fully immersive 3D virtual reality scenes, but that was likely years away.
The news comes as other AI researchers, startups such as Fei-Fei Li’s World Labs, and big tech companies such as Google seek to develop AI that can create 3D immersive, navigable worlds online from simple prompts or images.
More Midjourney updates coming soon
In addition, Holz said that Midjourney would launch multiple model personalization modes in the coming days.
Currently, Midjourney allows you to rate images to personalize the kinds of visuals you want to see in your generations, and fine-tune the model to your personal preferences. Now the startup will allow you to have multiple personalized versions you can toggle between.
In addition, Holz shared that Midjourney would allow users to upload and reference multiple images to “boards” to guide generations.
Furthermore, Midjourney will introduce video models and a Midjourney V7 AI image generator with improved prompt understanding early next year, after Christmas (December 25).
Holz further revealed Midjourney is working on 3-4 new hardware projects and said the startup was “trying to branch out and become a full research lab…it may take us six months to announce all six things.”