Bae Joonhyung – Breathing with the Chaos: Berlin Edition

Title

Rokkaku No No No (六角の脳、NO。)

Type

SNS/Web

Joonhyung BAE turns technology from something to be explained into something to be experienced, through AI research and artistic practice. He designs the rules and environments from which narratives emerge, and uses embodiment and play as methodology to invite audiences to discover meaning on their own.

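The “No” in the title layers three homophones, の (possession), 脳 (brain), and NO (absence), within a single sound. Whatever order they are arranged in, they are pronounced “no, no, no,” but the meaning changes each time. The first arrangement the artist proposes is “六角の脳、NO”: Rokkaku’s (の) brain (脳), and yet not (NO). It declares that what is shown here is not the artist’s brain itself but a representation produced by an AI model.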

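Basic Concept and Starting Point of the Work

When Google’s DeepDream first visualized the inside of a neural network in 2015, what it showed was the model’s own hallucination: dreamlike patterns in which the noise of every image the model had learned was tangled together. Rokkaku No No No asks a fundamentally different question. It opens up an AI model trained only on one artist’s visual language and brings out, in visual form, how that model remembers Rokkaku.

To do this, the work designs a new visualization algorithm. Where conventional feature visualization layers patterns over a finished image, the “Diffusion-Guided Feature Visualization” of this work intervenes in the image-making process itself, the generative process of a diffusion model in which form gradually rises out of noise, and guides it so that the patterns a given neuron prefers come into focus (結像) naturally. The 500 visual representations generated this way are floated in three-dimensional space, and the result is an interactive web work that visitors explore with bare-hand gestures.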

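How the Work Connects to the Exhibition Data

The project starts from Ayako Rokkaku (六角彩子) opening to the public, for this project, the painting data she has accumulated over many years. It is extremely rare for an established artist to allow images of her work to be used as an AI training dataset, and Rokkaku’s decision made the project possible.

Five hundred patches extracted from the paintings in the Breathing with the Chaos exhibition were used for DreamBooth training, and Rokkaku’s sculptures were converted into 3D meshes with Meshy AI and placed in the web space as core objects. Audio from Rokkaku’s interviews fills the web space, giving visitors the sense of “being inside the artist’s brain.”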

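Elements of Focus

When I first saw Rokkaku’s paintings, the masses of paint pushed out with her fingertips looked like clouds, and also like a view into someone’s brain. The traces of material transferred directly from finger to canvas, without a brush, formed an organic structure. That impression became the artistic premise of this work: could this dataset be used to build her “brain”? The elements I focused on in particular are:

Color fields that push the collision of high-saturation pink and sky blue to an extreme
Rounded, smeared textures that look pushed out by a finger
Neurons that generate the formal archetype of “big eyes” itself
The tactility of the sculptures, the result of translating the tactility of the paintings into three dimensions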

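How the Exhibition Was Interpreted and Translated

Where most generative AI works are interested in making “new images in Rokkaku’s style,” this work does not make images. It brings out the structure of memory formed inside the model. What it reveals is not a finished picture but the form in which the smallest units of the artist’s visual language exist inside the neural network.

In the Breathing with the Chaos exhibition, Rokkaku’s sculptures sit at the center of the space and the paintings are shown along the walls. I see the sculptures as the result of translating the tactility of the paintings into three dimensions. For that reason the sculpture is set as the core that holds the artist’s methodology, an object floating inside the brain.

Five hundred neuron bubbles float around the sculpture object, and the birds that cross the screen symbolize the smallest unit of Rokkaku’s painting, the formal archetype made by a single touch of the finger. The tangling bubbles are an analogy for the way she completes a painting while feeling its energy.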

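Why This Format Was Chosen

Building the work as a web-based interactive piece follows directly from the attitude of “an artist who opens data, an artist who opens tools.” Just as Rokkaku opened her data and made this project possible, I want to open my tools and make the next project possible.

Rokkaku does not use a brush; she applies acrylic paint to the canvas directly with her bare hands. “Brush NO” is translated into “mouse NO, controller NO”: once the webcam is active the visitor’s hand is recognized, pinching thumb and index finger together zooms in, and opening the hand rotates the space. The artist who paints with bare hands and the visitor who looks in with bare hands form a tactile symmetry.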
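The gesture mapping above can be illustrated with a small sketch. It is not the work’s actual code: the browser version would run in JavaScript, and Python is used here only to show the arithmetic. It assumes a hand arrives as the 21 normalized landmarks that MediaPipe Hands reports (index 0 is the wrist, 4 the thumb tip, 8 the index fingertip, 9 the base of the middle finger). The function names and both thresholds are hypothetical, and treating a wide thumb-to-index spread as an “open hand” is a simplifying assumption.

```python
import math

# Landmark indices from the MediaPipe Hands 21-point hand model.
WRIST = 0
THUMB_TIP = 4
INDEX_FINGER_TIP = 8
MIDDLE_FINGER_MCP = 9


def pinch_ratio(landmarks):
    """Thumb-to-index distance divided by palm length.

    Dividing by the palm length (wrist to middle-finger base) keeps the
    measure roughly independent of how far the hand is from the camera.
    """
    palm = math.dist(landmarks[WRIST], landmarks[MIDDLE_FINGER_MCP]) or 1e-9
    return math.dist(landmarks[THUMB_TIP], landmarks[INDEX_FINGER_TIP]) / palm


def classify_gesture(landmarks, pinch_threshold=0.25, open_threshold=0.9):
    """Map one hand pose to 'zoom', 'rotate', or 'idle'.

    Both thresholds are illustrative values, not taken from the work.
    """
    ratio = pinch_ratio(landmarks)
    if ratio < pinch_threshold:
        return "zoom"      # thumb and index finger pinched together
    if ratio > open_threshold:
        return "rotate"    # thumb and index finger spread wide apart
    return "idle"
```

In a live setting the function would be called once per video frame, and some smoothing over consecutive frames would probably be needed to avoid flicker between states.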

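Run through the same pipeline, a calligrapher’s strokes, an abstract painter’s color fields, and a printmaker’s ink bleeds would each activate entirely different neurons. No No No is a project that accumulates these differences one by one.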

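Step 1: Identifying Artist-Specific Neurons
Stable Diffusion 1.5 was trained with DreamBooth on 500 patches extracted from the paintings in the Breathing with the Chaos exhibition. Once training is done there are two copies of the same model: the original, which does not know Rokkaku, and the one that has learned Rokkaku. The same images are passed through both and the response strength of each neuron is compared. A neuron that is quiet before training and responds strongly after it is a “neuron awakened by Rokkaku,” that is, an artist-specific neuron.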
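The before-and-after comparison in Step 1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the work’s code: it assumes the activations of each model have already been recorded into arrays of shape (images, neurons), and it scores each neuron by the difference of mean activations. The function name, the scoring rule, and `top_k` are all hypothetical; the source does not say which score or layer was used.

```python
import numpy as np


def artist_specific_neurons(base_acts, tuned_acts, top_k=5):
    """Rank neurons by how much more they respond after fine-tuning.

    base_acts, tuned_acts: arrays of shape (n_images, n_neurons) holding the
    activations of the original and the fine-tuned model on the same images.
    Returns the indices of the top_k neurons and the per-neuron gain.
    """
    gain = tuned_acts.mean(axis=0) - base_acts.mean(axis=0)
    order = np.argsort(gain)[::-1]  # largest gain first
    return order[:top_k], gain
```

A difference of means is the simplest choice; a ratio or a statistical test would single out “quiet before, loud after” neurons more strictly.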

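Step 2: Diffusion-Guided Feature Visualization
At every step of the denoising process by which the diffusion model turns noise into an image, a signal that amplifies the target neuron’s response is inserted. If DeepDream scribbles over a finished photograph, this method intervenes in the developer bath of a darkroom so that a particular form rises by itself. The model’s natural image-generating ability is preserved while the target neuron’s preferred pattern comes into focus naturally.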
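The idea of nudging every denoising step toward a target neuron can be shown with a deliberately tiny toy. Nothing here is the work’s algorithm: the “denoiser” is a stand-in that merely shrinks the latent, the “neuron” is a linear readout whose gradient is known in closed form, and all names and values are hypothetical. In a real pipeline the denoiser would be the diffusion model and the gradient would come from automatic differentiation.

```python
import numpy as np


def toy_denoise(x, shrink=0.9):
    """Stand-in for one denoising step of a diffusion model."""
    return shrink * x


def neuron_activation(x, w):
    """Toy 'neuron': a linear readout of the latent vector."""
    return float(w @ x)


def guided_sampling(x_init, w, steps=20, guidance_scale=0.0):
    """Run the toy denoising loop, adding a guidance nudge at every step.

    After each denoising step the latent is moved along the gradient of the
    neuron's activation. For the linear neuron w @ x that gradient is w.
    """
    x = x_init.copy()
    for _ in range(steps):
        x = toy_denoise(x)
        x = x + guidance_scale * w  # gradient-ascent nudge toward the neuron
    return x
```

With `guidance_scale=0.0` the loop is plain denoising; any positive scale yields a final latent with a higher activation of the toy neuron, which is the effect the text describes.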

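Step 3: Latent Amplification
The result of Step 2 looks natural, but the model’s image-generating instinct somewhat suppresses the neuron’s pure preference. Latent Amplification is a post-processing step that selectively amplifies only the neuron’s preference signal within latent space.

Amplification proceeds in stages along an octave schedule: contours are strengthened at low resolution (256px), structure at medium resolution (384px), and texture at high resolution (512px). Like an orchestra bringing out a single phrase as it moves from pianissimo to fortissimo, the neuron’s preferred pattern is reinforced each time the resolution goes up.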
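The octave schedule can be sketched as a loop over the three resolutions named above. This is an assumption-laden illustration rather than the work’s implementation: it assumes the pixel sizes map to latent sizes through the factor-of-8 downsampling of the Stable Diffusion 1.5 autoencoder (so 32, 48, and 64), uses nearest-neighbour resizing, and takes the neuron’s gradient from a caller-supplied `grad_fn`. The function names, step count, and step size are hypothetical.

```python
import numpy as np

OCTAVES_PX = (256, 384, 512)  # resolutions named in the text
VAE_FACTOR = 8                # SD 1.5 latents are 1/8 of the pixel size


def resize_nearest(latent, size):
    """Nearest-neighbour resize of a (C, H, W) array to (C, size, size)."""
    _, h, w = latent.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return latent[:, rows][:, :, cols]


def amplify(latent, grad_fn, steps_per_octave=10, step_size=0.05):
    """Gradient ascent on a neuron objective, repeated at each octave.

    grad_fn(latent) must return the gradient of the neuron's activation with
    respect to the latent; here it is a placeholder supplied by the caller.
    """
    for px in OCTAVES_PX:
        latent = resize_nearest(latent, px // VAE_FACTOR)
        for _ in range(steps_per_octave):
            g = grad_fn(latent)
            # Normalize so the step size is comparable across octaves.
            latent = latent + step_size * g / (np.abs(g).mean() + 1e-8)
    return latent
```

Working coarse to fine means large shapes are settled before detail is added, which matches the contour, structure, texture order given in the text.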

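Applying this process to each of the 500 patches produces an amplified image, which becomes the texture of a 3D bubble. Three-dimensional coordinates are computed with PCA from the similarity between the images, and these determine each bubble’s position.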
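The PCA placement can be sketched with a singular value decomposition. This is a minimal example under assumptions: each image is taken to be already summarized as a feature vector (the source does not specify the features beyond mentioning latent vectors), and the function name is hypothetical.

```python
import numpy as np


def pca_3d(features):
    """Project (n_items, n_features) vectors onto their top 3 principal axes."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T
```

Items with similar feature vectors land near each other in the resulting coordinates, which is what lets related bubbles cluster in the 3D space.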

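Step 4: 3D Space Composition and Interaction
Photographs of Rokkaku’s sculptures were converted into 3D meshes with Meshy AI and lightened through retopology; the resulting objects are placed at several points in the web space, and 500 neuron bubbles float around them. Webcam-based bare-hand gesture recognition is implemented with MediaPipe Hands, and 3D rendering and real-time navigation with Three.js. When a visitor points at a sphere, the image the neuron generated fills the screen and, after a moment, transitions smoothly to Rokkaku’s original patch, so the visitor directly experiences the gap between “what the AI remembered” and “what the artist actually painted.”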

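Image and Video Generation Method
Visual representations: the images generated by the 500 neurons do not resemble any particular work by Rokkaku, yet they contain abstract elements that make anyone who knows her visual language feel at once that “this is Rokkaku-like.” They are Rokkaku’s “visual phonemes.” In three-dimensional space the color, texture, and form neurons cluster with their own kind and form a single terrain, in effect a map of Rokkaku’s visual language taken apart.

Submitted video: a web capture of about six minutes. It starts from a wide view of the sculpture object and the floating bubbles and captures, in real time, the process of exploring the space with hand gestures. It repeatedly shows the moment when, on selecting a bubble, the neuron image transitions to the original patch, while Rokkaku’s interview voice plays in the background.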

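Editing and Post-Production Process
3D conversion and optimization: photographs of Rokkaku’s sculptures were converted into 3D meshes with Meshy AI, then retopologized and lightened for real-time rendering on the web.

Spatial placement: the latent vectors of the 500 neuron images were projected to three-dimensional coordinates with PCA to determine bubble positions, and an orbiting animation that respects Euclidean distance was applied.

Interaction: MediaPipe Hands recognizes webcam-based bare-hand gestures, and Three.js handles 3D rendering and real-time navigation. For users without a webcam, the same navigation is also available with a mouse.

Installation requirements: as a web-based work it needs no separate software and can be exhibited with only a webcam and a sufficiently capable PC (Intel Core i7-10700 / NVIDIA GeForce RTX 3080 or equivalent and above).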

Rokkaku No No No

The “No” in the title layers three homophones, の (possession), 脳 (brain), and NO (negation), within a single sound. Regardless of their order, they are all pronounced “no, no, no,” yet the meaning shifts with each configuration. The artist’s initial proposition, “六角の脳、NO,” translates to “Rokkaku’s (の) brain (脳), yet not (NO).” It declares that what is presented here is not the artist’s brain itself but a representation generated by an AI model.

Concept and Point of Departure

When Google’s DeepDream first visualized the inner workings of neural networks in 2015, it revealed a form of hallucination—dreamlike patterns composed of the noise of all images the model had learned. Rokkaku No No No poses a fundamentally different question: what happens when we open the interior of an AI model trained exclusively on the visual language of a specific artist? How does this model “remember” Rokkaku?

To address this, the work introduces a new visualization method: Diffusion-Guided Feature Visualization. While conventional feature visualization overlays patterns onto completed images, this method intervenes directly in the generative process of diffusion models—guiding the emergence of forms from noise so that the preferred patterns of specific neurons naturally materialize.

The resulting 500 visual representations are placed within a three-dimensional space, forming an interactive web-based environment that visitors can explore through bare-hand gestures.

Connection to the Exhibition Dataset

The project originates from Ayako Rokkaku’s rare decision to publicly release her accumulated painting data for AI training. Such openness from an established artist is exceptional and made this work possible.

Five hundred patches extracted from paintings in Breathing with the Chaos were used to fine-tune a model via DreamBooth. Meanwhile, Rokkaku’s sculptures were converted into 3D meshes using Meshy AI and positioned as core objects within the virtual space. Audio from the artist’s interviews fills the environment, creating the sensation of entering the artist’s mind.

Key Focus

At first encounter, Rokkaku’s paintings evoke forms that resemble clouds or even the interior of a brain—organic structures shaped by the direct transfer of paint from finger to canvas. This impression became the conceptual foundation of the work: could this dataset be used to construct her “brain”?

Key elements include:

The intensified collision of highly saturated pinks and sky blues
Rounded, smeared textures suggestive of finger-driven motion
Neurons that generate archetypal forms such as “large eyes”
The tactile quality of sculpture as a three-dimensional extension of painterly touch

Interpretation and Translation

While most generative AI works aim to produce “new images in Rokkaku’s style,” this work does not create images in that sense. Instead, it reveals the internal structure of memory within the model.

Rather than presenting finished images, it exposes the minimal units that constitute the artist’s visual language and how they exist within the neural network.

In the original exhibition, Rokkaku’s sculptures occupy the central space, while paintings line the walls. Here, sculpture is interpreted as the core—an embodiment of her methodology—floating within the “brain.” Five hundred neuron-bubbles surround it, while bird-like forms crossing the space symbolize the smallest painterly unit: a single touch of the finger.

Reason for Format

The choice of a web-based interactive format reflects the ethos of “opening”—just as Rokkaku opened her data, this work opens its tools.

Rokkaku paints directly with her hands—“no brush.” This becomes “no mouse, no controller.” When the webcam is activated, the viewer’s hand is recognized: pinching zooms in, while spreading the hand rotates the space. The artist who paints with bare hands and the viewer who navigates with bare hands form a tactile symmetry.

Process

Step 1: Artist-Specific Neuron Identification
A Stable Diffusion 1.5 model was fine-tuned on the 500 patches via DreamBooth. The same images are passed through the original and fine-tuned models, and neurons that respond strongly only after training are identified as “Rokkaku-specific neurons.”

Step 2: Diffusion-Guided Feature Visualization
At each step of the denoising process, a signal is injected to amplify the activation of the target neuron. Unlike DeepDream’s surface overlays, this method intervenes in the formation process itself, allowing preferred patterns to emerge organically.

Step 3: Latent Amplification
To strengthen neuron-specific signals, selective amplification is applied in latent space along an octave schedule: contour at 256px, structure at 384px, and texture at 512px.

Each of the 500 processed outputs becomes a texture for a 3D bubble, with spatial positions determined via PCA based on similarity.

Step 4: 3D Space and Interaction
Rokkaku’s sculptures are converted into lightweight 3D meshes and distributed throughout the environment. Five hundred neuron-bubbles float and orbit around them. Gesture recognition is handled by MediaPipe Hands, and rendering and real-time navigation by Three.js.

When a bubble is selected, the generated image expands to fill the screen, then transitions seamlessly into the original patch—allowing viewers to directly experience the gap between “what AI remembers” and “what the artist created.”

Image and Video Generation

The neuron-generated images do not replicate any specific work by Rokkaku, yet they evoke her visual language in an abstract form. These are akin to “visual phonemes”—the smallest perceptual units of her style.

Clusters of color, texture, and form make up a three-dimensional topography: a decomposed map of Rokkaku’s visual language.

The exhibited video (approx. 6 minutes) is a screen capture of real-time navigation through this space, accompanied by the artist’s voice.

Post-Production and Technical Setup

3D Conversion and Optimization: Sculptures converted via Meshy AI and retopologized for real-time rendering
Spatial Mapping: PCA-based positioning with orbiting animations reflecting Euclidean distances
Interaction: MediaPipe Hands (gesture recognition) + Three.js (rendering and navigation)
Accessibility: Mouse interaction available as an alternative
Installation Requirements: Web-based; requires only a webcam and a high-performance PC (equivalent to Intel i7-10700 / NVIDIA RTX 3080 or higher)
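
One way to read the “orbiting animations reflecting Euclidean distances” in the Spatial Mapping item is as a rigid rotation of all bubbles about a shared axis, which moves every bubble while leaving the distances between them unchanged. The sketch below illustrates that reading only; it is an assumption, not a description of the actual animation, and the function names are hypothetical. The browser version would be written in JavaScript with Three.js, and Python is used here just to show the geometry.

```python
import numpy as np


def orbit(points, angle):
    """Rotate (N, 3) points about the vertical (y) axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, 0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s, 0.0, c]])
    return points @ rot.T


def pairwise_distances(points):
    """Euclidean distance between every pair of points, shape (N, N)."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)
```

Because the rotation is rigid, the PCA layout keeps its shape throughout the animation, so clusters that were close at the start stay close as they orbit.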