
Overview
The KakaoTalk Translation Agent is a macOS application designed to bridge language barriers in real-time communication. Born from the frustration of manually translating Korean messages, this tool automatically captures, processes, and translates KakaoTalk conversations, making bilingual communication effortless and natural.
The Problem
In today's globalized world, language barriers can significantly impact personal and professional communication. For Korean-English bilingual communication, the process typically involves:
- Manually copying each Korean message
- Opening a translation service
- Pasting and translating
- Copying the translation back
- Repeating for every message
This workflow is not only time-consuming but also disrupts the natural flow of conversation. It requires constant context switching and manual intervention, making real-time communication challenging and frustrating.
The Solution
The KakaoTalk Translation Agent replaces this manual workflow with an automated, real-time translation pipeline. The application:
- Automatically captures new messages from KakaoTalk windows
- Processes and extracts Korean text using advanced OCR
- Translates the content to English in real-time
- Presents both original and translated text in a clean, intuitive interface
- Maintains conversation history for future reference
Key Features
1. Real-time Translation
- Instant capture and translation of new messages
- Support for both individual and group chats
- Automatic handling of message formatting and emojis
2. Smart Window Detection
- Intelligent identification of KakaoTalk windows
- Support for multiple chat windows
- Automatic window tracking and updates
3. Advanced Message Processing
- Duplicate detection using Levenshtein distance algorithm
- Context-aware translation
- Support for various message types (text, emojis, basic formatting)
4. User Experience
- Clean, modern split-view interface
- Unread message tracking
- Persistent chat history
- Quick access to previous translations
Technical Implementation
Architecture
Built using SwiftUI and following the MVVM architecture pattern, the application consists of:
Translator-Agent/
├── Views/
│ ├── ContentView.swift
│ └── ChatDetailView.swift
├── ViewModels/
│ ├── ScreenshotViewModel.swift
│ └── ChatListViewModel.swift
├── Services/
│ ├── ScreenCaptureService.swift
│ ├── OpenAIVisionService.swift
│ └── StorageService.swift
└── Models/
└── Chat.swift
Core Components
- Screen Capture Service
  - Utilizes macOS screen capture APIs
  - Intelligent window region detection and cropping
  - Optimized performance with CGWindowList APIs
  - Handles window occlusion and partial visibility
- OpenAI Vision Service
  - Leverages OpenAI's Vision API for text extraction
  - Handles Korean text recognition and English translation
  - Implements smart caching for efficiency
  - Supports various text formats and styles
- Storage Service
  - Manages persistent storage of chat history
  - Efficient data serialization and deserialization
  - Secure local storage implementation
  - Optimized for quick retrieval and updates
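As a rough illustration of the Storage Service's role, the sketch below persists chat history as JSON using Codable. The type names (TranslatedMessage, Chat, ChatStore) and their fields are hypothetical and are not taken from the project's Chat.swift or StorageService.swift; the project's actual storage format may differ.

```swift
import Foundation

// Hypothetical model types for illustration only.
struct TranslatedMessage: Codable, Equatable {
    let sender: String
    let original: String      // Korean text as captured
    let translation: String   // English translation
    let timestamp: Date
}

struct Chat: Codable, Equatable {
    let id: String
    var title: String
    var unreadCount: Int
    var messages: [TranslatedMessage]
}

// Hypothetical store: serializes the full chat list to a single JSON file.
struct ChatStore {
    let fileURL: URL

    /// Encodes all chats and writes them atomically, so a crash mid-write
    /// does not leave a truncated history file behind.
    func save(_ chats: [Chat]) throws {
        let data = try JSONEncoder().encode(chats)
        try data.write(to: fileURL, options: .atomic)
    }

    /// Returns the stored chats, or an empty list if nothing was saved yet.
    func load() throws -> [Chat] {
        guard FileManager.default.fileExists(atPath: fileURL.path) else { return [] }
        let data = try Data(contentsOf: fileURL)
        return try JSONDecoder().decode([Chat].self, from: data)
    }
}
```

A single JSON file keeps the sketch short. A history that grows large would likely call for per-chat files or a database instead.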
Message Processing Flow
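Going by the steps listed under The Solution (capture, extract and translate, then keep a history) together with the duplicate detection described elsewhere in this document, one pass of the flow can be sketched roughly as follows. All names here (ExtractedMessage, MessagePipeline, and the closure properties) are hypothetical. The sketch is synchronous for brevity, whereas the application processes messages asynchronously.

```swift
import Foundation

// Hypothetical message shape: one captured line with its translation.
struct ExtractedMessage: Equatable {
    let sender: String
    let korean: String
    let english: String
}

// Hypothetical pipeline. Each stage is injected as a closure so it can be
// swapped out or stubbed in tests.
struct MessagePipeline {
    /// Captures the chat window and returns image data, or nil if the
    /// window is not currently available.
    let capture: () -> Data?
    /// Sends the image to the OCR/translation step and returns parsed messages.
    let extractAndTranslate: (Data) throws -> [ExtractedMessage]
    /// Returns true when the message is already present in the history.
    let isDuplicate: (ExtractedMessage, [ExtractedMessage]) -> Bool

    /// Runs one capture pass and returns the history with new messages appended.
    func run(history: [ExtractedMessage]) throws -> [ExtractedMessage] {
        guard let image = capture() else { return history }
        var updated = history
        for message in try extractAndTranslate(image) where !isDuplicate(message, updated) {
            updated.append(message)
        }
        return updated
    }
}
```

Because repeated screenshots of the same window will re-extract messages that were already seen, the duplicate check sits between extraction and storage.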
Technical Deep Dive
Duplicate Detection System
- Implements Levenshtein distance algorithm for message similarity
- 90% similarity threshold for duplicate detection
- Considers sender information in duplicate checking
- Handles variations in message formatting
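A minimal sketch of how such a check could look, assuming similarity is defined as 1 minus the edit distance divided by the longer string's length, and assuming "handles variations in formatting" means at least trimming surrounding whitespace. The Message type and function names are hypothetical; the project's actual normalization and similarity formula may differ.

```swift
import Foundation

/// Levenshtein edit distance, computed over Characters so that one Hangul
/// syllable counts as a single unit. Uses two rolling rows of the DP table.
func levenshtein(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    var previous = Array(0...t.count)
    var current = [Int](repeating: 0, count: t.count + 1)
    for i in 1...s.count {
        current[0] = i
        for j in 1...t.count {
            let cost = s[i - 1] == t[j - 1] ? 0 : 1
            current[j] = min(previous[j] + 1,         // deletion
                             current[j - 1] + 1,      // insertion
                             previous[j - 1] + cost)  // substitution
        }
        swap(&previous, &current)
    }
    return previous[t.count]
}

/// Similarity in 0...1 (assumed definition: 1 - distance / longer length).
func similarity(_ a: String, _ b: String) -> Double {
    let longest = max(a.count, b.count)
    guard longest > 0 else { return 1.0 }
    return 1.0 - Double(levenshtein(a, b)) / Double(longest)
}

// Hypothetical message type for illustration.
struct Message {
    let sender: String
    let text: String
}

/// Treats two messages as duplicates only when the sender matches and the
/// trimmed texts are at least `threshold` similar (0.9 = the 90% rule).
func isDuplicate(_ candidate: Message, of existing: Message, threshold: Double = 0.9) -> Bool {
    guard candidate.sender == existing.sender else { return false }
    let lhs = candidate.text.trimmingCharacters(in: .whitespacesAndNewlines)
    let rhs = existing.text.trimmingCharacters(in: .whitespacesAndNewlines)
    return similarity(lhs, rhs) >= threshold
}
```

A fuzzy threshold rather than exact equality matters here because OCR of the same message can differ by a character or two between captures. Note that with this definition very short messages need an exact match to clear 90%.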
Performance Optimizations
- Efficient window detection using CGWindowList APIs
- Smart caching of translations
- Asynchronous processing of messages
- Optimized UI updates with minimal redraws
- Memory-efficient image processing
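To illustrate the CGWindowList-based window detection, the sketch below filters the dictionaries returned by CGWindowListCopyWindowInfo down to normal-layer windows owned by KakaoTalk. The owner name "KakaoTalk", the WindowCandidate type, and both function names are assumptions for illustration. The filter is kept separate from the system call so it can be exercised with sample data.

```swift
import Foundation
#if os(macOS)
import CoreGraphics
#endif

// Hypothetical result type for illustration.
struct WindowCandidate: Equatable {
    let windowID: Int
    let title: String
}

/// Keeps only layer-0 (normal) windows owned by `ownerName`.
/// The dictionary keys are the string values of the CoreGraphics constants
/// kCGWindowOwnerName, kCGWindowLayer, kCGWindowNumber and kCGWindowName.
func kakaoTalkWindows(in windowInfo: [[String: Any]],
                      ownerName: String = "KakaoTalk") -> [WindowCandidate] {
    windowInfo.compactMap { info in
        guard
            info["kCGWindowOwnerName"] as? String == ownerName,
            info["kCGWindowLayer"] as? Int == 0,
            let id = info["kCGWindowNumber"] as? Int
        else { return nil }
        // The window title may be missing unless the app has been granted
        // Screen Recording permission.
        return WindowCandidate(windowID: id, title: info["kCGWindowName"] as? String ?? "")
    }
}

#if os(macOS)
/// Queries the on-screen window list and applies the filter above.
func currentKakaoTalkWindows() -> [WindowCandidate] {
    let options: CGWindowListOption = [.optionOnScreenOnly, .excludeDesktopElements]
    guard let info = CGWindowListCopyWindowInfo(options, kCGNullWindowID) as? [[String: Any]] else {
        return []
    }
    return kakaoTalkWindows(in: info)
}
#endif
```

Querying window metadata this way is cheap compared with capturing pixels, so it can run on every polling tick, with an actual capture triggered only for the windows that match.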
Demo Video
Watch the KakaoTalk Translation Agent in action:

Getting Started
Requirements
- macOS 13.0 or later
- Xcode 15.0 or later
- OpenAI API key
- KakaoTalk desktop application
Installation
- Clone the repository
- Open Translator-Agent.xcodeproj in Xcode
- Add your OpenAI API key in the configuration
- Build and run the application
API Key Setup
Two methods are available:
- Environment Variable (Recommended)
    export OPENAI_API_KEY='your-api-key-here'
- UserDefaults (Development Only)
    AppConfig.setAPIKey("your-api-key-here")
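The two methods suggest a lookup order in which the environment variable wins and the stored value is a fallback. Here is a minimal sketch of that order, assuming the development key is stored in UserDefaults under a key named "OPENAI_API_KEY". That storage key and the resolveAPIKey function are hypothetical, and how AppConfig actually stores and reads the key is not shown in this document.

```swift
import Foundation

/// Hypothetical UserDefaults key for the development fallback.
let apiKeyDefaultsKey = "OPENAI_API_KEY"

/// Returns the API key from the environment when it is set and non-empty,
/// otherwise the stored fallback, otherwise nil.
/// Both sources are parameters so the lookup order can be tested in isolation.
func resolveAPIKey(
    environment: [String: String] = ProcessInfo.processInfo.environment,
    storedKey: String? = UserDefaults.standard.string(forKey: apiKeyDefaultsKey)
) -> String? {
    if let key = environment["OPENAI_API_KEY"]?
        .trimmingCharacters(in: .whitespacesAndNewlines),
       !key.isEmpty {
        return key
    }
    if let key = storedKey, !key.isEmpty {
        return key
    }
    return nil
}
```

Keep in mind that a variable exported in a shell is only visible to processes started from that shell. When running from Xcode, the variable is usually set in the scheme's Run settings under Environment Variables.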
Future Enhancements
The project is actively maintained with planned features including:
- Support for additional languages
- Enhanced message formatting
- Improved translation accuracy
- Additional chat platform support
- Advanced conversation analytics
Contributing
Contributions are welcome! The project is open source and available under the MIT License. Feel free to:
- Submit bug reports
- Propose new features
- Improve documentation
- Enhance existing functionality
Visit the GitHub repository to get started.