PowerToys/doc/devdocs/modules/textextractor.md
Gleb Khmyznikov 725535b760
Some checks failed
Spell checking / Check Spelling (push) Has been cancelled
Spell checking / Report (Push) (push) Has been cancelled
Spell checking / Report (PR) (push) Has been cancelled
Spell checking / Update PR (push) Has been cancelled
[DevDocs] More content and restructure (#40165)
## Summary of the Pull Request
Accumulated information from internal transition about the modules
development, and reworked it to be added in dev docs. Also the dev docs
intself was restructured to be more organized. New pages was
verified by transition team.

## PR Checklist
- [x] **Dev docs:** Added/updated

---------

Co-authored-by: Zhaopeng Wang (from Dev Box) <zhaopengwang@microsoft.com>
Co-authored-by: Hao Liu <liuhao3418@gmail.com>
Co-authored-by: Peiyao Zhao <105847726+zhaopy536@users.noreply.github.com>
Co-authored-by: Mengyuan <162882040+chenmy77@users.noreply.github.com>
Co-authored-by: zhaopeng wang <33367956+wang563681252@users.noreply.github.com>
Co-authored-by: Jaylyn Barbee <51131738+Jaylyn-Barbee@users.noreply.github.com>
2025-07-01 14:27:34 +02:00

2.1 KiB

Text Extractor

Public overview - Microsoft Learn

All Issues
Bugs
Pull Requests

Overview

Text Extractor is a PowerToys utility that enables users to extract and copy text from anywhere on the screen, including inside images and videos. The module uses Optical Character Recognition (OCR) technology to recognize text in visual content. This module is based on Joe Finney's Text Grab.

How it works

Text Extractor captures the screen content and uses OCR to identify and extract text from the selected area. Users can select a region of the screen, and Text Extractor will convert any visible text in that region into copyable text.

Architecture

Components

  • EventMonitor: Handles the ShowPowerOCRSharedEvent which triggers the OCR functionality
  • OCROverlay: The main UI component that provides:
    • Language selection for OCR processing
    • Canvas for selecting the screen area to extract text from
  • Screen Capture: Uses CopyFromScreen to capture the screen content as the overlay background image

Activation Methods

  • Global Shortcut: Activates Text Extractor through a keyboard shortcut
  • LaunchOCROverlayOnEveryScreen: Functionality to display the OCR overlay across multiple monitors

Technical Implementation

Text Extractor is implemented using Windows Presentation Foundation (WPF) technology, which provides the UI framework for the selection canvas and other interface elements.

User Experience

When activated, Text Extractor displays an overlay on the screen that allows users to select an area containing text. Once selected, the OCR engine processes the image and extracts any text found, which can then be copied to the clipboard.