25 Years of Programming
An open source source for C, C++, OWL, BASIC, MDB, XLS, DOT, and more...

Online Regular Expression Search and Replace Text Editor

Instructions

  1. Type or paste your starting text into the top input box.
     
  2. Click the buttons for the actions you want to perform. Each action is applied to all the text in the box.
     
  3. To do a regular expression search and replace, enter a JavaScript-compatible PCRE regular expression in the Find box and your replacement text in the Replacement box, then click RegexReplace. Regular expressions aren't always complicated. Unless you want to find variations of a pattern, Find is simply the text you want to find and Replacement is the text to put in its place.
     
  4. RegexMatch shows a list, in the separate Regex Matches box, of all matches of the whole regex, in the order found. It doesn't alter the original text.
     
  5. RegexExec shows a preview, in the Regex Matches box, of the $0 whole-match variable and the $1-$9 backreference variables that would be created from the parenthesized expressions if you clicked RegexReplace. It doesn't alter the original text.

    Its results differ from those of RegexMatch because the two buttons report different things. RegexExec, like RegexReplace, fills $0 with the whole-pattern match and $1-$9 with the (possibly nested) parenthesized subpatterns of the expression. RegexMatch instead fills $0-$9 with successive matches of the whole regex.

    In order to show all results, RegexMatch and RegexExec always use the g flag whether checked or not.
     
  6. Undo restores the text to what it was before the last button click. Click Undo more than once to go back multiple steps. Redo moves forward again through the steps you just undid.
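
The difference between RegexMatch and RegexExec described in step 5 can be sketched in plain JavaScript. This is only an illustration under assumptions, not the editor's actual code: the helper names listMatches and listExecResults and the sample text are hypothetical.

```javascript
// Hypothetical helper: every match of the whole pattern, like the RegexMatch button.
function listMatches(text, find) {
  return text.match(new RegExp(find, "g")) || [];
}

// Hypothetical helper: for each match, the whole match ($0) followed by
// its parenthesized groups ($1, $2, ...), like the RegexExec button.
function listExecResults(text, find) {
  const re = new RegExp(find, "g");
  const results = [];
  let m;
  while ((m = re.exec(text)) !== null) {
    if (m[0] === "") re.lastIndex++; // step past empty matches to avoid looping forever
    results.push(Array.from(m));
  }
  return results;
}

const sample = "2011-05-17 and 2012-06-18";
const datePattern = "(\\d+)-(\\d+)-(\\d+)";

console.log(listMatches(sample, datePattern));
// [ '2011-05-17', '2012-06-18' ]

console.log(listExecResults(sample, datePattern));
// [ [ '2011-05-17', '2011', '05', '17' ], [ '2012-06-18', '2012', '06', '18' ] ]
```

Both helpers use the g flag, matching the note that RegexMatch and RegexExec always search the whole text.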

Description

This editor is mainly for doing in-place Regular Expression search and replace operations, with one-click Undo for the times when it takes multiple attempts to get it right.

It also provides some one-click functions that I do often and some other functions just because they were possible.

It's not intended as a full-featured text editor, but as a way to do common useful transformations of blocks or snippets of text with a single click. As shown in the example below, it's useful for building batch files or bash scripts based on data in a list. It allows doing Perl or sed-like text transformations without the need for separate input/output files or STDIN/STDOUT piping and with the ability to quickly Undo and try again.

Example

I needed to execute the same two commands on about 80 files. To avoid repetitive typing, I first listed the files in Windows Command Prompt:

C:\TEMP>dir /b *.bas
BARGRAPH.BAS
PRINMAZE.BAS

Then I ran a RegExp search and replace. With none of the optional regex flags checked, the expression in the Find box captures all the text on each line into the numbered variable $1, and the Replacement line supplies the text of the commands with the $1 filenames embedded where needed:

Find Regexp: (.*)
Replacement: LOAD "$1"\nSAVE "$1",A

Result:

LOAD "BARGRAPH.BAS"
SAVE "BARGRAPH.BAS",A
LOAD "PRINMAZE.BAS"
SAVE "PRINMAZE.BAS",A
and about 160 more lines that I didn't have to type manually
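
For readers who want to reproduce the example outside the editor, here is a minimal JavaScript sketch. It assumes the text is split into lines and the replacement is applied to each line with no flags set, as described for single-line mode in the Notes. The function name replacePerLine is hypothetical, and the \n typed in the Replacement box is written here as a real newline inside the string.

```javascript
// Hypothetical helper: apply one regex replacement to each line separately.
function replacePerLine(text, find, replacement) {
  const re = new RegExp(find); // no flags: only the first match on each line
  return text
    .split("\n")
    .map((line) => line.replace(re, replacement))
    .join("\n");
}

const listing = "BARGRAPH.BAS\nPRINMAZE.BAS";
const commands = replacePerLine(listing, "(.*)", 'LOAD "$1"\nSAVE "$1",A');

console.log(commands);
// LOAD "BARGRAPH.BAS"
// SAVE "BARGRAPH.BAS",A
// LOAD "PRINMAZE.BAS"
// SAVE "PRINMAZE.BAS",A
```

In plain JavaScript, adding the g flag to (.*) would also match the empty string at the end of each line and produce an extra, empty LOAD/SAVE pair, which is one reason to leave the flags unchecked here.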

Notes

1) RegExp Flags

i=Case-insensitive
By default, regular expression alphabetic matching is case-sensitive. A regex designed to match "g" in the text will not match "G". The i flag makes matching case-insensitive so that a regex to match "g" will match either "g" or "G".

g=Global/all. Find or Replace all occurrences of matched text, not just the first
By default, Find and Replace only applies to the first match found. Check the g option if you want all occurrences to be found or replaced.
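
The effect of the i and g flags can be seen with standard JavaScript replace calls. This is a small illustration only; the helper name replaceWithFlags and the sample text are made up for the demonstration.

```javascript
// Hypothetical helper: build the regex from a pattern string and a flag string.
function replaceWithFlags(text, find, replacement, flags) {
  return text.replace(new RegExp(find, flags), replacement);
}

const flagSample = "Gg gG";

console.log(replaceWithFlags(flagSample, "g", "x", ""));   // "Gx gG"  first lowercase g only
console.log(replaceWithFlags(flagSample, "g", "x", "i"));  // "xg gG"  first g or G only
console.log(replaceWithFlags(flagSample, "g", "x", "g"));  // "Gx xG"  every lowercase g
console.log(replaceWithFlags(flagSample, "g", "x", "gi")); // "xx xx"  every g or G
```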

m=Multi-line text
It is rare that you'd want to use this option. It does not mean that there are multiple lines in the input box. Rather, it means that the input box text should not be broken apart into individual lines before doing the Regular Expression Search and Replace. Instead, the input box text is processed as one long string that has newlines embedded in it (thus "multiline").

That is different from single-line mode, where the input box text is first broken apart into lines and the Search and Replace is performed on each line separately. In that case, none of the lines being processed ever contains a newline because they were all removed while the text was being split.

The m option allows searching and replacing text phrases that might span multiple lines, but the period metachar expressions "." and ".*" still don't match newline characters in this mode, so they still stop matching when a newline is encountered. Example: start.*stop will not find a phrase where the two words have any newlines between them. However, [\s\S] and [\s\S]* (meaning "either a whitespace char or a non-whitespace char") do match newlines, so they can be used as an alternative to . and .* in multi-line mode.

The m flag significantly affects how the ^ and $ anchors work. ^ matches the start of any line, either because it's the first character in the text or because it's the first character following an embedded newline. $ matches the end of any line, either just before an embedded newline, or at the very end of all the text.
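
The behaviour of ".", [\s\S], ^ and $ on one long string with embedded newlines can be checked directly in JavaScript. The sketch below shows standard JavaScript regex behaviour only; it does not reproduce the editor's line-splitting step, and the sample text and variable names are made up.

```javascript
const multiSample = "start one\ntwo stop\nlast line";

// "." never crosses a newline, so this phrase is not found...
const dotFinds = /start.*stop/m.test(multiSample);        // false
// ...but [\s\S] matches any character, newlines included.
const classFinds = /start[\s\S]*stop/m.test(multiSample); // true

// With m, ^ matches at the start of every line; without it, only at the start of the text.
const quotedAll = multiSample.replace(/^/gm, "> ");
const quotedFirst = multiSample.replace(/^/g, "> ");

// With m, $ matches at the end of every line; without it, only at the end of the text.
const lastWordsM = multiSample.match(/\w+$/gm); // [ 'one', 'stop', 'line' ]
const lastWords = multiSample.match(/\w+$/g);   // [ 'line' ]

console.log(dotFinds, classFinds);
console.log(quotedAll);   // every line prefixed with "> "
console.log(quotedFirst); // only the first line prefixed
console.log(lastWordsM, lastWords);
```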

2) "Lines" are determined by the presence of newline characters. The editor window wraps text to fit, so what looks like the end of a line on your screen might not be the end of a physical line. In Firefox 4, you can expand the editor window to fit your screen size.

3) As shown in the Example, the Replacement box can contain the standard backreference variables $1-$9, referring to text captured by parenthesized expressions in the Find box. The numbers correspond to the order in which the opening parentheses are encountered going left to right. You can use the RegexExec button to preview how the parenthesized expressions will be matched.
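
To illustrate how the numbers follow the opening parentheses, including nested ones, here is a short JavaScript sketch. The date text and the names nestedPattern and reordered are invented for the illustration.

```javascript
// Opening parentheses, left to right:
//   $1 = ((\d+)-(\d+))  year-month together (the outer group)
//   $2 = (\d+)          year   (first group nested inside $1)
//   $3 = (\d+)          month  (second group nested inside $1)
//   $4 = (\d+)          day
const nestedPattern = /((\d+)-(\d+))-(\d+)/;

const reordered = "2011-05-17".replace(nestedPattern, "$4/$3/$2 [$1]");

console.log(reordered); // "17/05/2011 [2011-05]"
```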

4) Sentence Case converts the entire string to lower case, then capitalizes the first letter of each paragraph, the first letter following a period and space, and each instance of the word "I". If the starting text contains proper nouns or other capitalized words, Sentence Case doesn't preserve them, so it can create more work than it saves.
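
The Sentence Case steps described above could be sketched roughly as follows. This is an approximation under assumptions, not the editor's actual code: the function name sentenceCase is hypothetical, and a paragraph is assumed to start at the beginning of the text or after a newline.

```javascript
// Hypothetical sketch of the Sentence Case steps.
function sentenceCase(text) {
  return text
    .toLowerCase()
    // Capitalize the first letter of each paragraph (assumed: start of text or after a newline).
    .replace(/(^|\n)([a-z])/g, (whole, lead, letter) => lead + letter.toUpperCase())
    // Capitalize the first letter following a period and one or more spaces.
    .replace(/(\. +)([a-z])/g, (whole, lead, letter) => lead + letter.toUpperCase())
    // Capitalize the standalone word "i".
    .replace(/\bi\b/g, "I");
}

console.log(sentenceCase("HELLO world. i think PARIS is nice.\nnext PARA."));
// Hello world. I think paris is nice.
// Next para.
```

The output shows the drawback mentioned in the note: the proper noun "Paris" comes out in lower case.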

5) ToCharCodes produces a comma-separated list of the decimal ASCII character codes corresponding to the entered characters: "ABC" becomes 65,66,67.

6) FromCharCodes converts a list of decimal ASCII character codes to plain text. The delimiters do not have to be commas. Any sequence of non-digits is treated as a delimiter.
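
A minimal JavaScript sketch of both conversions, assuming plain ASCII input. The function names toCharCodes and fromCharCodes are hypothetical stand-ins for the two buttons.

```javascript
// Hypothetical stand-in for ToCharCodes: text to comma-separated decimal codes.
function toCharCodes(text) {
  return text
    .split("")
    .map((ch) => ch.charCodeAt(0))
    .join(",");
}

// Hypothetical stand-in for FromCharCodes: any run of non-digits acts as a delimiter.
function fromCharCodes(list) {
  const codes = list.match(/\d+/g) || [];
  return String.fromCharCode(...codes.map(Number));
}

console.log(toCharCodes("ABC"));            // "65,66,67"
console.log(fromCharCodes("65,66,67"));     // "ABC"
console.log(fromCharCodes("72 / 105; 33")); // "Hi!"
```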

7) UnescapeChars recognizes several decimal and hexadecimal formats in which ASCII characters can be encoded (%HH  &#xHH  \xHH  &#DD), plus a few HTML entity encodings (&quot; &amp; &apos; &lt; &gt; &nbsp;), and converts the encoded characters to plain ASCII text, even if they are intermingled with ordinary non-encoded text.

To encode text into these formats, first use the ToCharCodes (to convert to decimal) or Hex (to convert to hexadecimal) buttons, and then replace the intervening commas or whitespace with whatever encoding prefix you want to use.
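
One way to decode these formats in a single pass is sketched below. It is an illustration under assumptions, not the editor's implementation: the function name unescapeChars, the optional trailing semicolon on numeric entities, and the entity names quot, amp, apos, lt and gt are all assumed.

```javascript
// Assumed entity names for the five characters " & ' < >.
const NAMED_ENTITIES = { quot: '"', amp: "&", apos: "'", lt: "<", gt: ">" };

// One combined pattern, so each encoded character is decoded exactly once:
//   %HH   \xHH   &#xHH   &#DD   &name;
const ENCODED =
  /%([0-9a-f]{2})|\\x([0-9a-f]{2})|&#x([0-9a-f]+);?|&#(\d+);?|&(quot|amp|apos|lt|gt);/gi;

function unescapeChars(text) {
  return text.replace(ENCODED, (whole, pct, backslash, hexEntity, decEntity, name) => {
    const hex = pct || backslash || hexEntity;
    if (hex) return String.fromCharCode(parseInt(hex, 16));
    if (decEntity) return String.fromCharCode(parseInt(decEntity, 10));
    return NAMED_ENTITIES[name.toLowerCase()];
  });
}

console.log(unescapeChars("%41&#x42;\\x43&#68; &lt;b&gt; plain"));
// ABCD <b> plain
```

Using one alternation instead of several successive replace calls avoids decoding text twice: "%26lt;" becomes "&lt;" and is not decoded a second time into "<".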

8) RegexSplit uses the JavaScript split(/RegExp/gm) method to split all the input box text into an array, using Find Regexp as the field-separating delimiter, which is removed. Then the array is joined back together using Replacement as the new delimiter.

  • RegexSplit always uses the g and m flags. It pays attention to whether i is checked or not.
  • If Find Regexp is empty, the default delimiter is [ \t,]+ (any sequence of spaces, TABs, commas).
  • If Replacement is empty, the default new delimiter is \n to break the text onto separate lines.

Example 1: If the input box contains sentence text and both boxes are empty, RegexSplit breaks the text so each word is on a line by itself.   

Example 2: If the input box contains comma-separated values (CSV) such as 1 , 2,3,  4,5 and Find Regexp is empty and Replacement contains \t, the text is transformed to tab-separated values (TSV), a format that usually pastes into applications with fewer problems than CSV.
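
The split-then-join behaviour and the two defaults can be sketched in a few lines of JavaScript. The function name regexSplit and its parameters are hypothetical; empty strings stand for empty Find Regexp and Replacement boxes, and a real tab character stands for the \t typed in the Replacement box.

```javascript
// Hypothetical sketch of RegexSplit: split on a regex, join with a new delimiter.
function regexSplit(text, find, replacement, ignoreCase) {
  const flags = "gm" + (ignoreCase ? "i" : "");       // g and m always, i only if checked
  const re = new RegExp(find || "[ \\t,]+", flags);   // default: spaces, TABs, commas
  const glue = replacement === "" ? "\n" : replacement; // default: one item per line
  return text.split(re).join(glue);
}

// Example 1: both boxes empty, each word ends up on its own line.
console.log(regexSplit("the quick brown fox", "", "", false));

// Example 2: comma-separated values to tab-separated values.
console.log(regexSplit("1 , 2,3,  4,5", "", "\t", false)); // "1\t2\t3\t4\t5"
```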

9) Grep separates the input text lines into two groups. It puts lines containing a match of Find Regexp in the Regex Matches output box (the lower box), and leaves behind the lines not containing any match in the original input box (the upper box). Grep only operates on whole lines of the input text. It ignores the g and m flag settings, but pays attention to whether i is checked or not.
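
The two-way separation that Grep performs can be sketched like this. The function name grepLines and the returned property names matched and rest are hypothetical; they stand for the lower and upper boxes.

```javascript
// Hypothetical sketch of Grep: partition whole lines by whether they contain a match.
function grepLines(text, find, ignoreCase) {
  // No g flag, so test() keeps no state between lines.
  const re = new RegExp(find, ignoreCase ? "i" : "");
  const matched = [];
  const rest = [];
  for (const line of text.split("\n")) {
    (re.test(line) ? matched : rest).push(line);
  }
  return { matched: matched.join("\n"), rest: rest.join("\n") };
}

const fruit = "apple\nBanana\ncherry";

console.log(grepLines(fruit, "an", false));
// { matched: 'Banana', rest: 'apple\ncherry' }

console.log(grepLines(fruit, "b", true).matched); // "Banana"
```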

10) The labels Find Regexp and Replacement are buttons. Click one to clear its box, or double-click to clear both boxes. Regex flags is also a button. Click it to clear all the flags, or double-click to check them all.

I developed and tested the editor with Firefox 4.0.1 and Internet Explorer 8. Bug reports, feature suggestions, and questions are welcome in the Forum.

 
