Reimagining Architecture: A Semiotic Study of Sound in AI-Generated Spatial Design

Authors

DOI:

https://doi.org/10.38027/smart.v2n1-7

Keywords:

Semiotics, Architectural Theory and Criticism, Architectural Design, Artificial Intelligence, Multi-Sensory Design

Abstract

This study examines how artificial intelligence (AI) interprets spoken architectural language by analysing vocal features—such as pitch, tone, and magnitude—and translating them into visual representations. Situated within the field of architectural semiotics, the research investigates how sound functions not merely as an acoustic phenomenon but as a symbolic agent in AI-mediated design. Five ambiguous architectural terms (vault, shell, column, plan, story) were recorded in two distinct sentence contexts and vocal styles (neutral vs. expressive). Using the Librosa Python library, pitch range and vocal magnitude were extracted as prosodic features. These metrics informed the construction of emotionally nuanced text prompts for MidJourney, a generative AI model, to produce architectural images reflecting vocal delivery. The results reveal consistent correlations between vocal variation and visual form: high pitch and strong vocal energy led to expressive, fluid, and emotionally charged spaces; lower pitch and stable magnitude generated grounded, monumental, or contemplative structures. These outcomes suggest that vocal expression can serve as a semiotic input in cross-modal AI workflows, where speech acts as both data and design material.

By bridging sound and space, the study expands the semiotic framework of architectural representation and introduces voice as a generative modality in AI-assisted urban and spatial design. The findings support a multi-sensory design paradigm where not only what is said, but how it is said, shapes architectural meaning.
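The workflow summarised in the abstract (prosodic features extracted with Librosa, then used to shape a text prompt) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Prosody` class, the function names, the thresholds, and the prompt wording are all hypothetical, and the descriptors simply reuse the language the abstract applies to its results. The Librosa calls (`load`, `pyin`, `feature.rms`) are one plausible way to estimate pitch range and magnitude; the study does not state which functions it used.

```python
from dataclasses import dataclass


@dataclass
class Prosody:
    """Summary of one recording (hypothetical container, not from the study)."""
    pitch_min_hz: float
    pitch_max_hz: float
    mean_rms: float

    @property
    def pitch_range_hz(self) -> float:
        return self.pitch_max_hz - self.pitch_min_hz


def extract_prosody(path: str) -> Prosody:
    """Estimate pitch range and mean RMS magnitude for one audio file."""
    # Imported lazily so the prompt helpers below work without audio libraries.
    import librosa
    import numpy as np

    y, sr = librosa.load(path, sr=None)  # keep the file's native sampling rate
    f0, voiced_flag, _ = librosa.pyin(
        y,
        fmin=librosa.note_to_hz("C2"),
        fmax=librosa.note_to_hz("C7"),
        sr=sr,
    )
    voiced = f0[voiced_flag]  # drop unvoiced frames, which pyin marks as NaN
    if voiced.size == 0:
        raise ValueError("no voiced frames found in recording")
    rms = librosa.feature.rms(y=y)[0]  # frame-wise root-mean-square energy
    return Prosody(float(voiced.min()), float(voiced.max()), float(np.mean(rms)))


# Assumed thresholds for illustration only; the study does not report cut-offs.
WIDE_PITCH_RANGE_HZ = 100.0
HIGH_RMS = 0.05


def describe(p: Prosody) -> str:
    """Map prosodic measurements to descriptive wording for a prompt."""
    wide_pitch = p.pitch_range_hz >= WIDE_PITCH_RANGE_HZ
    strong_energy = p.mean_rms >= HIGH_RMS
    if wide_pitch and strong_energy:
        return "expressive, fluid, emotionally charged"
    if not wide_pitch and not strong_energy:
        return "grounded, monumental, contemplative"
    # Mixed cases are not described in the abstract; this wording is a placeholder.
    return "restrained, with moments of emphasis"


def build_prompt(term: str, p: Prosody) -> str:
    """Compose a text prompt for an image generator from a term and its prosody."""
    return f"architectural interior featuring a {term}, {describe(p)}"
```

For example, `build_prompt("vault", extract_prosody("vault_expressive.wav"))` would produce a prompt for the expressive reading of "vault", where the file name is a placeholder. Keeping extraction separate from the descriptor mapping lets the thresholds be tuned without re-analysing the audio.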

Published

2025-07-21

How to Cite

Reimagining Architecture: A Semiotic Study of Sound in AI-Generated Spatial Design. (2025). Smart Design Policies, 2(1), 107-121. https://doi.org/10.38027/smart.v2n1-7