[Geometry] Image-to-Geometry Workflow with Dynamo & Gemini

 


Image-to-Geometry Workflow example

The video below shows the current results of our ongoing project, where we are experimenting with outputting the form of a recognized image directly into Dynamo Geometry. While we first tested automated modeling from images and colors several years ago, this latest experiment is focused on significantly simplifying and advancing that core process.


In the video above, the overall shape of the image is read well, but the geometry visibly breaks down toward the back. To address this, we updated how the AI understands and defines the pattern of the shape.

A subsequent video demonstrates our new, more sophisticated approach where the workflow is decoupled into three distinct stages. While these stages are logically separate, they are currently combined into a single, powerful Python node.


Here is a breakdown of the roles within that node:

This is a single Python script that combines the three stages below into one node.

A few usage notes: do not raise the number slider above 5. The generation rules have not been fully organized yet, so excessive loading and shaping work can stall the run. Also, use your own API key.


Stage 1: The AI Analyst

  • Role: A ‘Computational Geometry Expert’ that analyzes an image and provides a detailed, human-readable description of its geometric patterns, rules, and relationships.
  • Inputs: Image file path, API Key, and the AI model to be used (Pro or Flash).
  • Process: It utilizes the advanced reasoning of the Gemini Pro model to analyze the visual information and generate text that explains the essence of the pattern.
  • Output: A detailed text description, such as the recursive construction method for a Sierpinski triangle.
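A minimal sketch of how Stage 1 maps the node's inputs to a Gemini request. The function name, prompt wording, and model identifiers are illustrative assumptions, not the exact ones used in the node; the actual API call (shown in the trailing comment) requires the google-generativeai package and a valid key.

```python
# Illustrative 'AI Analyst' prompt; the real node's wording may differ.
ANALYST_PROMPT = (
    "You are a computational geometry expert. Analyze the attached image "
    "and describe its geometric patterns, construction rules, and "
    "relationships in precise, human-readable text."
)

def build_analyst_request(image_path: str, model_choice: str) -> dict:
    """Map the node inputs (image path, Pro/Flash selector) to a request spec."""
    # Model names are assumptions for this sketch.
    models = {"Pro": "gemini-1.5-pro", "Flash": "gemini-1.5-flash"}
    if model_choice not in models:
        raise ValueError(f"Unknown model choice: {model_choice}")
    return {
        "model": models[model_choice],
        "prompt": ANALYST_PROMPT,
        "image_path": image_path,
    }

# The actual call would look roughly like this (requires the
# google-generativeai package and an API key):
#
#   import google.generativeai as genai
#   genai.configure(api_key=API_KEY)
#   model = genai.GenerativeModel(request["model"])
#   response = model.generate_content([request["prompt"], PIL.Image.open(path)])
#   description = response.text
```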

Stage 2: The AI Translator

  • Role: A ‘Data Structuring Expert’ that translates the complex natural-language text from Stage 1 into strictly formatted JSON that Stage 3 can process mechanically.
  • Inputs: The text output from Stage 1, API Key, and the AI model name.
  • Process: It uses a highly constrained prompt with strong role assignment and explicit prohibitions (e.g., “Do not ask for the image”). This forces the AI to avoid the temptation of content analysis and focus solely on the ‘Text-to-JSON’ conversion task.
  • Output: A clean JSON-formatted text with keys such as id, type, and description.
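The Stage 2 idea can be sketched as a constrained prompt plus a validation step on the model's reply. The prompt text and validator below are illustrative assumptions about what the node does, not its exact contents.

```python
import json

# Illustrative translator prompt: strong role assignment plus explicit
# prohibitions, so the model performs only the text-to-JSON conversion.
TRANSLATOR_PROMPT = (
    "You are a data structuring expert. Convert the following geometric "
    "description into a JSON array of elements, each with the keys "
    "'id', 'type', and 'description'. Output JSON only. "
    "Do not ask for the image. Do not analyze content beyond the text."
)

def validate_stage2_output(raw_text: str) -> list:
    """Parse the model's reply and check that every element has the required keys."""
    elements = json.loads(raw_text)
    for element in elements:
        missing = {"id", "type", "description"} - element.keys()
        if missing:
            raise ValueError(f"Element {element} is missing keys: {missing}")
    return elements
```

Validating the reply before handing it to Stage 3 keeps malformed model output from failing silently inside the geometry step.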

Stage 3: The Dynamo Constructor

  • Role: A ‘Geometry Expert’ that receives the structured JSON ‘blueprint’ from Stage 2, interprets and extracts the data within, and ‘constructs’ the actual Dynamo geometry.
  • Inputs: The JSON text output from Stage 2.
  • Process: The node parses the text using json.loads(). It then applies regular expressions to the description field of each element to find patterns like Start-(x,y,z) and extract precise numerical coordinates. These coordinates are then used to generate the final geometry objects (Points, Lines, NurbsCurves, etc.).
  • Output: A list of the final geometry objects, visible in the Dynamo viewport.
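The parsing half of Stage 3 is plain Python and can be sketched as follows; the exact coordinate label format (e.g. Start-(x,y,z)) is an assumption based on the description above. The Dynamo-specific geometry construction, which only runs inside a Dynamo Python node, is indicated in the trailing comment.

```python
import json
import re

# Matches labeled coordinate triples such as "Start-(0, 0, 0)".
# The label format in the real node's descriptions is an assumption here.
COORD = re.compile(
    r"(\w+)-\(\s*(-?[\d.]+)\s*,\s*(-?[\d.]+)\s*,\s*(-?[\d.]+)\s*\)"
)

def extract_coordinates(description: str) -> dict:
    """Return e.g. {'Start': (x, y, z), 'End': (x, y, z)} from a description string."""
    return {
        label: (float(x), float(y), float(z))
        for label, x, y, z in COORD.findall(description)
    }

def parse_blueprint(json_text: str) -> list:
    """Parse the Stage 2 JSON and attach extracted coordinates to each element."""
    elements = json.loads(json_text)
    for element in elements:
        element["coords"] = extract_coordinates(element["description"])
    return elements

# Inside Dynamo, each element would then become real geometry, e.g.:
#   from Autodesk.DesignScript.Geometry import Point, Line
#   line = Line.ByStartPointEndPoint(
#       Point.ByCoordinates(*coords["Start"]),
#       Point.ByCoordinates(*coords["End"]))
```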

