Online Regular Expression Search and Replace Text Editor
Instructions
- Type or paste your starting text into the top input box.
- Click the buttons for the actions you want to perform. Each action is applied to all the
text in the box.
- To do a regular expression search and replace, enter a JavaScript-compatible (Perl-style) regular expression in the Find box, your replacement text in the Replacement box, and click RegexReplace. Regular expressions aren't always complicated. Unless you want to find variations of a pattern, Find is simply the text you want to find (as long as it contains no special characters such as . * ? or parentheses) and Replacement is the replacement text.
- RegexMatch shows a list, in the separate
Regex Matches box, of all matches of the whole regex, in the order found. It
doesn't alter the original text.
- RegexExec shows a preview, in the Regex Matches box, of the $0 whole-match variable and the $1-$9 backreference variables that would be created from the parenthesized expressions if you clicked RegexReplace. It doesn't alter the original text.
Its results are different from RegexMatch because RegexExec and RegexReplace fill $0 with the whole-pattern match and then fill $1-$9 with the (possibly nested) parenthesized subpatterns in the expression, whereas RegexMatch fills $0-$9 with all the matches of the whole regex, so the different functions are reporting very different things.
In order to show all results, RegexMatch and RegexExec always use the g flag whether checked or not.
- Undo restores the text to what it was before the last button click. Click Undo more than once to go back multiple steps. Redo moves forward again through the steps you just undid.
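The difference between the three regex buttons can be sketched with plain JavaScript string and RegExp methods. This is only an illustration under the assumption that the buttons behave like the standard `match`, `exec`, and `replace` calls; the sample text and pattern are made up, and the editor's actual code may differ.

```javascript
// Hypothetical sample text and pattern, chosen only for illustration.
const text = "2021-03-04 and 2022-11-30";
const pattern = /(\d+)-(\d+)-(\d+)/g;

// Like RegexMatch: every match of the whole regex, in the order found.
const allMatches = text.match(pattern);
// ["2021-03-04", "2022-11-30"]

// Like RegexExec: the whole match (the editor's $0) followed by each
// parenthesized group ($1, $2, $3) for the first match.
pattern.lastIndex = 0;
const firstGroups = Array.from(pattern.exec(text));
// ["2021-03-04", "2021", "03", "04"]

// Like RegexReplace: the groups are available in the replacement as $1-$9.
const replaced = text.replace(pattern, "$3/$2/$1");
// "04/03/2021 and 30/11/2022"
```

Neither `match` nor `exec` changes `text`; only the value returned by `replace` differs from the original.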
Description
This editor is mainly for doing in-place Regular Expression search and replace operations, with one-click Undo for the times when it takes multiple attempts to get it right.
It also provides some one-click functions that I do often and some other functions just because they were possible.
It's not intended as a full-featured text editor, but as a way to do common useful
transformations of blocks or snippets of text with a single click. As shown in the example below, it's useful for
building batch files or bash scripts based on data in a list. It allows doing Perl or sed-like
text transformations without the need for separate input/output files or STDIN/STDOUT
piping and with the ability to quickly Undo and try again.
Example
I needed to execute the same two commands on about 80 files. To avoid repetitive typing,
I first listed the files in Windows Command Prompt:
C:\TEMP>dir /b *.bas
BARGRAPH.BAS
PRINMAZE.BAS
Then I ran a RegExp search and replace. With none of the optional regex flags checked, the expression in the Find box captures all the text on each line into the numbered variable $1, and the Replacement line supplies the text of the commands with the $1 filenames embedded where needed:
Find Regexp: (.*)
Replacement: LOAD "$1"\nSAVE "$1",A
Result:
LOAD "BARGRAPH.BAS"
SAVE "BARGRAPH.BAS",A
LOAD "PRINMAZE.BAS"
SAVE "PRINMAZE.BAS",A
and about 160 more lines that I didn't have to type manually
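For readers who want to reproduce the same transformation outside the editor, here is a minimal JavaScript sketch. It assumes the no-flags behavior described in the Notes (the text is split into lines and the replacement is applied to each line separately); the variable names are mine and not part of the editor.

```javascript
// The two file names from the listing above, one per line.
const listing = "BARGRAPH.BAS\nPRINMAZE.BAS";

// Same Find and Replacement values as in the example.
const find = /(.*)/;
const replacement = 'LOAD "$1"\nSAVE "$1",A';

// Assumed single-line mode: process each line on its own, then rejoin.
const result = listing
  .split("\n")
  .map((line) => line.replace(find, replacement))
  .join("\n");

console.log(result);
```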
Notes
1) RegExp Flags
i=Case-insensitive
By default, regular expression alphabetic matching is case-sensitive. A regex designed to match "g" in the text will
not match "G". The i flag makes matching case-insensitive so that a regex to match "g" will
match either "g" or "G".
g=Global/all. Find or Replace all occurrences of matched text, not just the first
By default, Find and Replace only applies to the first match found.
Check the g option if you want all occurrences to be found or replaced.
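The effect of the i and g flags can be seen with the standard JavaScript `replace` method. The sample sentence is made up for illustration, and I am assuming the checkboxes map directly onto the JavaScript flags of the same names.

```javascript
// Hypothetical sample text.
const sample = "Great big eggs";

// No flags: case-sensitive, first match only.
const noFlags = sample.replace(/g/, "#");       // "Great bi# eggs"

// i: case-insensitive, still first match only.
const withI = sample.replace(/g/i, "#");        // "#reat big eggs"

// g: case-sensitive, all matches.
const withG = sample.replace(/g/g, "#");        // "Great bi# e##s"

// g and i together: all matches, either case.
const withBoth = sample.replace(/g/gi, "#");    // "#reat bi# e##s"
```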
m=Multi-line text
It is rare that you'd want to use this option. It does not mean that there are multiple
lines in the input box. Rather, it means that the input box text should not be broken
apart into individual lines before doing the Regular Expression Search and Replace. Instead, the
input box text is processed as one long string that has newlines embedded in it (thus "multiline").
That is different from single-line mode, where the input box text is first broken apart
into lines and the Search and Replace is performed on each line separately. In that case, none
of the lines being processed ever contains a newline because they were all removed while the
text was being split.
The m option allows searching and replacing text phrases that might span multiple lines, but the period metachar expressions "." and ".*" still don't match newline characters in this mode, so they still stop matching when a newline is encountered. Example: start.*stop will not find a phrase where the two words have any newlines between them. However, [\s\S] and [\s\S]* (meaning "either a whitespace char or a non-whitespace char", that is, any char at all) do match newlines, so they can be used as an alternative to . and .* in multi-line mode.
The m flag significantly affects how the ^ and $
anchors work. ^ matches the start of any line, either because it's the first character
in the text or because it's the first character following an embedded newline. $
matches the end of any line, either just before an embedded newline, or at the very end of all
the text.
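The points about the period and the anchors can be checked with a few lines of standard JavaScript applied to one string with embedded newlines. The sample text is made up for illustration.

```javascript
// Hypothetical three-line text held as one string with embedded newlines.
const text = "start here\nthen stop\nend";

// "." never matches a newline, so .* cannot bridge the two lines.
const dotStopsAtNewline = /start.*stop/.test(text);         // false

// [\s\S] matches any character, including newlines.
const classCrossesNewline = /start[\s\S]*stop/.test(text);  // true

// Without m, ^ matches only at the very start of the text.
const startWithoutM = text.match(/^\w+/g);   // ["start"]

// With m, ^ and $ match at the start and end of every line.
const startWithM = text.match(/^\w+/gm);     // ["start", "then", "end"]
const endWithM = text.match(/\w+$/gm);       // ["here", "stop", "end"]
```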
2) "Lines" are determined by the presence of newline characters. The editor window wraps text to fit, so what looks like the end of a line on your screen might not be the end of a physical line. In Firefox 4, you can expand the editor window to fit your screen size.
3) As shown in the Example, the Replacement box can contain the standard backreference variables $1-$9, referring to text captured by parenthesized expressions in the Find box. The numbers correspond to the order in which the opening parentheses are encountered going left to right. You can use the RegexExec button to preview how the parenthesized expressions will be matched.
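The numbering rule for nested parentheses is easiest to see with a small case. The pattern and input below are made up for illustration; the result comes from the standard JavaScript `exec` method.

```javascript
// Groups are numbered by the position of their opening parenthesis:
//   group 1: ((\d+)-(\d+))   the outer pair
//   group 2: (\d+)           first inner pair
//   group 3: (\d+)           second inner pair
//   group 4: (\w+)
const groups = Array.from(/((\d+)-(\d+)) (\w+)/.exec("10-20 units"));
// ["10-20 units", "10-20", "10", "20", "units"]

// The same numbers are used in a replacement string.
const swapped = "10-20 units".replace(/((\d+)-(\d+)) (\w+)/, "$4: $3 down to $2");
// "units: 20 down to 10"
```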
4) Sentence Case converts the entire string to lower case, then capitalizes the first letter of
each paragraph, the first letter following a period and space, and each instance of the word "I".
If the starting text contains proper nouns or other capitalized words, Sentence Case doesn't
preserve them, so it can create more work than it saves.
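The steps described for Sentence Case could be approximated as follows. This is a sketch under my own assumptions (a paragraph starts at the beginning of the text or after a newline), and `sentenceCase` is a hypothetical name, not the editor's actual function.

```javascript
// Hypothetical re-creation of the Sentence Case steps described above.
function sentenceCase(text) {
  return text
    .toLowerCase()
    // First letter of each paragraph (start of text or after a newline).
    .replace(/(^|\n)([a-z])/g, (match, lead, ch) => lead + ch.toUpperCase())
    // First letter following a period and a space.
    .replace(/(\. )([a-z])/g, (match, lead, ch) => lead + ch.toUpperCase())
    // Each standalone word "i".
    .replace(/\bi\b/g, "I");
}

const cased = sentenceCase("HELLO from Paris. i think I AM here.\nnext PARAGRAPH");
// "Hello from paris. I think I am here.\nNext paragraph"
```

Note that "Paris" comes out as "paris", which is the proper-noun problem mentioned above.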
5) ToCharCodes produces a comma-separated list of the decimal ASCII character codes corresponding to the entered
characters: "ABC" becomes 65,66,67.
6) FromCharCodes converts a list of
decimal ASCII character codes to plain text. The delimiters do not have to be commas. Any sequence
of non-digits is treated as a delimiter.
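The two conversions can be sketched with the standard `charCodeAt` and `String.fromCharCode` calls. The function names below are hypothetical stand-ins for the buttons, and the sketch assumes ASCII input as described.

```javascript
// Hypothetical stand-in for ToCharCodes: "ABC" -> "65,66,67".
function toCharCodes(text) {
  return Array.from(text, (ch) => ch.charCodeAt(0)).join(",");
}

// Hypothetical stand-in for FromCharCodes: any run of non-digits is a delimiter.
function fromCharCodes(list) {
  const codes = list
    .split(/\D+/)
    .filter((piece) => piece !== "")   // drop empties from leading/trailing delimiters
    .map(Number);
  return String.fromCharCode(...codes);
}

const codes = toCharCodes("ABC");            // "65,66,67"
const plain = fromCharCodes("65, 66;67");    // "ABC"
```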
7) UnescapeChars recognizes several decimal and hexadecimal formats in which ASCII characters can be encoded (%HH &#xHH \xHH &#DD), plus the HTML entity encodings of a few characters (" & ' < >), and converts the encoded characters to plain ASCII text, even if they are intermingled with ordinary non-encoded text.
To encode text into these formats, first use the ToCharCodes (to convert to
decimal) or Hex (to convert to hexadecimal) buttons, and then replace the intervening
commas or whitespace with whatever encoding prefix you want to use.
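A decoder for the formats listed above might look like the following. This is a sketch, not the editor's code: `unescapeChars` is a hypothetical name, the entity names (quot, amp, apos, lt, gt) are my assumption about which entities are meant, and treating the trailing semicolon of numeric entities as optional is also an assumption.

```javascript
// Hypothetical decoder for %HH, \xHH, &#xHH, &#DD and a few named entities.
function unescapeChars(text) {
  // Assumed entity names for the characters " & ' < >.
  const named = { quot: '"', amp: "&", apos: "'", lt: "<", gt: ">" };
  // One pass over the text, so already-decoded output is never decoded again.
  const pattern =
    /%([0-9a-fA-F]{2})|\\x([0-9a-fA-F]{2})|&#x([0-9a-fA-F]+);?|&#(\d+);?|&(quot|amp|apos|lt|gt);/g;
  return text.replace(pattern, (match, pct, backslashX, hexEntity, decEntity, name) => {
    const hex = pct || backslashX || hexEntity;
    if (hex !== undefined) return String.fromCharCode(parseInt(hex, 16));
    if (decEntity !== undefined) return String.fromCharCode(Number(decEntity));
    return named[name];
  });
}

const decoded = unescapeChars("%41\\x42&#x43;&#68; &lt;b&gt; &amp; plain");
// "ABCD <b> & plain"
```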
8) RegexSplit uses the JavaScript split(/RegExp/gm) method to split all the input box text into an array, using matches of Find Regexp as the field-separating delimiters, which are removed. Then the array is joined back together using Replacement as the new delimiter.
- RegexSplit always uses the g and m flags. It pays
attention to whether i is checked or not.
- If Find Regexp is empty, the default delimiter is
[ \t,]+ (any sequence of spaces, TABs, commas).
- If Replacement is empty, the default new delimiter is
\n to break the text onto
separate lines.
Example 1: If the input box contains sentence text and both boxes are empty,
RegexSplit breaks the text so each word is on a line by itself.
Example 2: If the input box contains comma-separated values (CSV) such as
1 , 2,3, 4,5 and Find Regexp is empty
and Replacement contains \t, the text is transformed
to tab-separated values
(TSV), a format that usually pastes into applications with fewer problems than CSV.
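The split-then-join behavior and its defaults can be sketched in a few lines. `regexSplit` and its parameters are hypothetical names for illustration; only `split`, `join`, and the `RegExp` constructor are standard JavaScript.

```javascript
// Hypothetical stand-in for the RegexSplit button.
// find and replacement are the contents of the two boxes; ignoreCase is the i checkbox.
function regexSplit(text, find, replacement, ignoreCase) {
  const source = find === "" ? "[ \\t,]+" : find;          // default delimiter
  const joiner = replacement === "" ? "\n" : replacement;  // default new delimiter
  const flags = "gm" + (ignoreCase ? "i" : "");            // g and m are always on
  return text.split(new RegExp(source, flags)).join(joiner);
}

// Example 1: both boxes empty, one word per line.
const words = regexSplit("the quick brown fox", "", "", false);
// "the\nquick\nbrown\nfox"

// Example 2: CSV to TSV.
const tsv = regexSplit("1 , 2,3, 4,5", "", "\t", false);
// "1\t2\t3\t4\t5"
```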
9) The labels Find Regexp and Replacement are buttons. Click either one to clear its box. Double-click clears both. The Regex flags label is also a button: click to clear all flags, double-click to check all.
I developed and tested the editor with Firefox 4.0.1 and Internet Explorer 8.
Bug reports, feature suggestions, and questions are welcome in the
Forum.