
A Different Approach to Text to 3D Generation

Sneha N. Dessai, Rachel Dhanaraj


Designing 3D graphics requires knowledge of complex software and is a time-consuming process. A system that takes text as input and generates an equivalent 3D scene would therefore be very useful. We propose a system that automatically generates a 3D equivalent of the text input by identifying the explicit and implicit constraints it contains. The system also incorporates user interaction to improve the view of the scene and learns automatically from the user’s choices.
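To make the distinction between the two kinds of constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes that explicit constraints are the objects and spatial relations stated in the text, and that implicit constraints are filled in from prior knowledge when the text is silent. The object vocabulary, the relation list, the support priors, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical vocabulary and priors, invented for this illustration only.
KNOWN_OBJECTS = {"room", "table", "chair", "cake", "plate", "lamp"}
SPATIAL_RELATIONS = ("on", "under", "next to", "left of", "right of")
# Assumed prior: the surface an object usually rests on when the text does not say.
DEFAULT_SUPPORT = {"table": "floor", "chair": "floor", "plate": "table", "lamp": "table"}


def explicit_constraints(text):
    """Return the objects and spatial relations that the text states directly."""
    lowered = text.lower()
    words = re.findall(r"[a-z]+", lowered)
    # Keep each known object once, in order of first mention.
    objects = list(dict.fromkeys(w for w in words if w in KNOWN_OBJECTS))

    obj_pattern = "|".join(sorted(KNOWN_OBJECTS))
    rel_pattern = "|".join(re.escape(r) for r in SPATIAL_RELATIONS)
    pattern = (
        rf"\b({obj_pattern})\b(?:\s+is)?\s+({rel_pattern})\s+"
        rf"(?:the\s+|a\s+)?({obj_pattern})\b"
    )
    relations = [m.groups() for m in re.finditer(pattern, lowered)]
    return objects, relations


def build_scene_template(text):
    """Combine explicit constraints with implicit ones inferred from the priors."""
    objects, relations = explicit_constraints(text)
    constraints = [
        {"subject": s, "relation": r, "target": t, "source": "explicit"}
        for s, r, t in relations
    ]
    # Objects whose support is already stated need no implicit constraint.
    supported = {s for s, r, _ in relations if r == "on"}
    for obj in objects:
        if obj not in supported and obj in DEFAULT_SUPPORT:
            constraints.append(
                {"subject": obj, "relation": "on",
                 "target": DEFAULT_SUPPORT[obj], "source": "implicit"}
            )
    return {"objects": objects, "constraints": constraints}
```

For the input "There is a table and a lamp. The cake is on the table.", this sketch records the cake-on-table relation as explicit and adds implicit support constraints for the table and the lamp. A real system would learn such priors from data or from user feedback rather than hard-coding them.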


Explicit Constraints, Implicit Constraints, Render Scene, Scene Graph and Scene Template.





