{{draft task|Text between}}
;Task: Get the text in a string that occurs between a start and end delimiter. Programs will be given a search string, a start delimiter string, and an end delimiter string. The delimiters will not be unset, and will not be the empty string.
The value returned should be the text in the search string that occurs between the '''first''' occurrence of the start delimiter (starting after the text of the start delimiter) and the '''first''' occurrence of the end delimiter after that.
If the start delimiter is not present in the search string, a blank string should be returned.
If the end delimiter is not present after the end of the first occurrence of the start delimiter in the search string, the remainder of the search string after that point should be returned.
There are two special values for the delimiters. If the value of the start delimiter is "start", the beginning of the search string will be matched. If the value of the end delimiter is "end", the end of the search string will be matched.
Example 1. Both delimiters set
Text: "Hello Rosetta Code world"
Start delimiter: "Hello "
End delimiter: " world"
Output: "Rosetta Code"
Example 2. Start delimiter is the start of the string
Text: "Hello Rosetta Code world"
Start delimiter: "start"
End delimiter: " world"
Output: "Hello Rosetta Code"
Example 3. End delimiter is the end of the string
Text: "Hello Rosetta Code world"
Start delimiter: "Hello "
End delimiter: "end"
Output: "Rosetta Code world"
Example 4. End delimiter appears before and after start delimiter
Text: "</div><div style=\"chinese\">你好嗎</div>"
Start delimiter: "<div style=\"chinese\">"
End delimiter: "</div>"
Output: "你好嗎"
Example 5. End delimiter not present
Text: "<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">"
Start delimiter: "<text>"
End delimiter: "<table>"
Output: "Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">"
Example 6. Start delimiter not present
Text: "<table style=\"myTable\"><tr><td>hello world</td></tr></table>"
Start delimiter: "<table>"
End delimiter: "</table>"
Output: ""
Example 7. Multiple instances of end delimiter after start delimiter (match until the first one)
Text: "The quick brown fox jumps over the lazy other fox"
Start delimiter: "quick "
End delimiter: " fox"
Output: "brown"
Example 8. Multiple instances of the start delimiter (start matching at the first one)
Text: "One fish two fish red fish blue fish"
Start delimiter: "fish "
End delimiter: " red"
Output: "two fish"
Example 9. Start delimiter is end delimiter
Text: "FooBarBazFooBuxQuux"
Start delimiter: "Foo"
End delimiter: "Foo"
Output: "BarBaz"
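The rules and examples above can be sketched in Python as follows. This is a minimal illustration rather than a reference solution; the function name text_between and its parameter names are assumptions made for this sketch and do not come from the task description.

```python
def text_between(text: str, start: str, end: str) -> str:
    """Return the text between the first `start` and the first `end` after it."""
    if start == "start":
        begin = 0  # special value: match the beginning of the text
    else:
        pos = text.find(start)
        if pos == -1:
            return ""  # start delimiter absent: blank result
        begin = pos + len(start)  # skip past the start delimiter itself

    if end == "end":
        return text[begin:]  # special value: match the end of the text

    stop = text.find(end, begin)  # search only after the start delimiter
    # End delimiter absent: return the remainder of the text
    return text[begin:] if stop == -1 else text[begin:stop]


if __name__ == "__main__":
    print(text_between("Hello Rosetta Code world", "Hello ", " world"))  # Rosetta Code
    print(text_between("FooBarBazFooBuxQuux", "Foo", "Foo"))  # BarBaz
```

Searching for the end delimiter from begin onward is what makes Example 4 work: an earlier occurrence of the end delimiter is never considered.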
ALGOL 68
{{works with|ALGOL 68G|Any - tested with release 2.8.3.win32}}
Uses the ALGOL 68G-specific string in string procedure. For other compilers or interpreters, a version of string in string is available at [[ALGOL_68/prelude]].
As Algol 68 predates Unicode, the fourth example deviates from the task.
BEGIN
# some utility operators #
# returns the length of a string #
OP LENGTH = ( STRING a )INT: ( UPB a - LWB a ) + 1;
# returns the position of s in t or UPB t + 1 if s is not present #
PRIO INDEXOF = 1;
OP INDEXOF = ( STRING t, STRING s )INT:
IF INT pos; string in string( s, pos, t ) THEN pos ELSE UPB t + 1 FI;
# returns the text after s in t or "" if s is not present #
PRIO AFTER = 1;
OP AFTER = ( STRING t, STRING s )STRING:
IF INT pos = t INDEXOF s; pos > UPB t THEN "" ELSE t[ pos + LENGTH s : ] FI;
# returns the text before s in t or t if s is not present #
PRIO BEFORE = 1;
OP BEFORE = ( STRING t, STRING s )STRING:
IF INT pos = t INDEXOF s; pos > UPB t THEN t ELSE t[ : pos - 1 ] FI;
# mode to hold a pair of STRINGs for the BETWEEN operator #
MODE STRINGPAIR = STRUCT( STRING left, right );
# returns a STRINGPAIR composed of a and b (standard priority for AND) #
# with additional operators for CHARs as "a" is a CHAR denotation, #
# not a STRING of length 1 #
OP AND = ( STRING a, STRING b )STRINGPAIR: ( a, b );
OP AND = ( STRING a, CHAR b )STRINGPAIR: ( STRING(a), b );
OP AND = ( CHAR a, CHAR b )STRINGPAIR: ( STRING(a), STRING(b) );
OP AND = ( CHAR a, STRING b )STRINGPAIR: ( a , STRING(b) );
# tracing flag for BETWEEN - if TRUE, debug output is shown #
BOOL trace between := FALSE;
# returns the text of s between the delimiters specified in d #
PRIO BETWEEN = 1;
OP BETWEEN = ( STRING s, STRINGPAIR d )STRING:
BEGIN
STRING result := s;
IF left OF d /= "start" THEN result := result AFTER left OF d FI;
IF right OF d /= "end" THEN result := result BEFORE right OF d FI;
IF trace between THEN
# show debug output #
print( ( "Text: """, s, """", newline
, "Start delimiter: """, left OF d, """", newline
, "End delimiter: """, right OF d, """", newline
, "Output: """, result, """", newline
, newline
)
)
FI;
result
END # BETWEEN # ;
# test cases #
BEGIN
STRING s;
trace between := TRUE;
s := "Hello Rosetta Code world" BETWEEN "Hello " AND " world";
s := "Hello Rosetta Code world" BETWEEN "start" AND " world";
s := "Hello Rosetta Code world" BETWEEN "Hello " AND "end";
s := "</div><div style=""french"">bonjour</div>"
BETWEEN "<div style=""french"">"
AND "</div>";
s := "<text>Hello <span>Rosetta Code</span> world</text><table style=""myTable"">"
BETWEEN "<text>" AND "<table>";
s := "<table style=""myTable""><tr><td>hello world</td></tr></table>"
BETWEEN "<table>" AND "</table>";
s := "The quick brown fox jumps over the lazy other fox"
BETWEEN "quick " AND " fox";
s := "One fish two fish red fish blue fish"
BETWEEN "fish " AND " red";
s := "FooBarBazFooBuxQuux" BETWEEN "Foo" AND "Foo";
trace between := FALSE
END
END
{{out}}
Text: "Hello Rosetta Code world"
Start delimiter: "Hello "
End delimiter: " world"
Output: "Rosetta Code"
Text: "Hello Rosetta Code world"
Start delimiter: "start"
End delimiter: " world"
Output: "Hello Rosetta Code"
Text: "Hello Rosetta Code world"
Start delimiter: "Hello "
End delimiter: "end"
Output: "Rosetta Code world"
Text: "</div><div style="french">bonjour</div>"
Start delimiter: "<div style="french">"
End delimiter: "</div>"
Output: "bonjour"
Text: "<text>Hello <span>Rosetta Code</span> world</text><table style="myTable">"
Start delimiter: "<text>"
End delimiter: "<table>"
Output: "Hello <span>Rosetta Code</span> world</text><table style="myTable">"
Text: "<table style="myTable"><tr><td>hello world</td></tr></table>"
Start delimiter: "<table>"
End delimiter: "</table>"
Output: ""
Text: "The quick brown fox jumps over the lazy other fox"
Start delimiter: "quick "
End delimiter: " fox"
Output: "brown"
Text: "One fish two fish red fish blue fish"
Start delimiter: "fish "
End delimiter: " red"
Output: "two fish"
Text: "FooBarBazFooBuxQuux"
Start delimiter: "Foo"
End delimiter: "Foo"
Output: "BarBaz"
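The AFTER / BEFORE decomposition used in the ALGOL 68 entry can also be sketched in Python with str.partition. The helper names after, before and between are assumptions chosen to mirror the ALGOL 68 operators; this is an illustration of the same idea, not a translation of the entry.

```python
def after(text: str, delim: str) -> str:
    """Text following the first `delim`, or "" if `delim` is absent."""
    _, found, rest = text.partition(delim)
    return rest if found else ""


def before(text: str, delim: str) -> str:
    """Text preceding the first `delim`, or all of `text` if `delim` is absent."""
    # partition returns (text, "", "") when delim is not found
    return text.partition(delim)[0]


def between(text: str, start: str, end: str) -> str:
    """Apply the start rule, then the end rule to what is left."""
    result = text if start == "start" else after(text, start)
    return result if end == "end" else before(result, end)


if __name__ == "__main__":
    print(between("One fish two fish red fish blue fish", "fish ", " red"))  # two fish
```

Because the end delimiter is looked up only in the text that remains after the start delimiter, the two steps compose without any index arithmetic.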
AppleScript
my text_between("Hello Rosetta Code world", "Hello ", " world")

on text_between(this_text, start_text, end_text)
    set return_text to ""
    try
        if (start_text is not "start") then
            set AppleScript's text item delimiters to start_text
            set return_text to text items 2 thru end of this_text as string
        else
            set return_text to this_text
        end if
        if (end_text is not "end") then
            set AppleScript's text item delimiters to end_text
            set return_text to text item 1 of return_text as string
            set AppleScript's text item delimiters to ""
        end if
    end try
    set AppleScript's text item delimiters to ""
    return return_text
end text_between
AWK
# syntax: GAWK -f TEXT_BETWEEN.AWK
BEGIN {
main("Hello Rosetta Code world","Hello "," world","1. Both delimiters set")
main("Hello Rosetta Code world","start"," world","2. Start delimiter is the start of the string")
main("Hello Rosetta Code world","Hello","end","3. End delimiter is the end of the string")
main("</div><div style=\"chinese\">???</div>","<div style=\"chinese\">","</div>",
"4. End delimiter appears before and after start delimiter")
main("<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">","<text>","<table>",
"5. End delimiter not present")
main("<table style=\"myTable\"><tr><td>hello world</td></tr></table>","<table>","</table>",
"6. Start delimiter not present")
main("The quick brown fox jumps over the lazy other fox","quick "," fox",
"7. Multiple instances of end delimiter after start delimiter (match until the first one)")
main("One fish two fish red fish blue fish","fish "," red",
"8. Multiple instances of the start delimiter (start matching at the first one)")
main("FooBarBazFooBuxQuux","Foo","Foo","9. Start delimiter is end delimiter")
main("Hello Rosetta Code world","start","end","10. Start and end delimiters use special values")
main("Hello Rosetta Code world","","x","11. Null start delimiter")
main("Hello Rosetta Code world","x","","12. Null end delimiter")
exit(0)
}
function main(text,sdelim,edelim,example, pos,str) {
printf("Example %s\n",example)
printf("Text: '%s'\n",text)
printf("sDelim: '%s'\n",sdelim)
printf("eDelim: '%s'\n",edelim)
if (sdelim == "" || edelim == "") {
printf("error: null delimiter\n\n")
return
}
if (sdelim == "start") {
str = text
}
else {
pos = index(text,sdelim)
if (pos > 0) {
str = substr(text,pos+length(sdelim))
}
}
if (edelim == "end") {
}
else {
pos = index(str,edelim)
if (pos > 0) {
str = substr(str,1,pos-1)
}
}
printf("Output: '%s'\n\n",str)
}
{{out}}
Example 1. Both delimiters set
Text: 'Hello Rosetta Code world'
sDelim: 'Hello '
eDelim: ' world'
Output: 'Rosetta Code'
Example 2. Start delimiter is the start of the string
Text: 'Hello Rosetta Code world'
sDelim: 'start'
eDelim: ' world'
Output: 'Hello Rosetta Code'
Example 3. End delimiter is the end of the string
Text: 'Hello Rosetta Code world'
sDelim: 'Hello'
eDelim: 'end'
Output: ' Rosetta Code world'
Example 4. End delimiter appears before and after start delimiter
Text: '</div><div style="chinese">???</div>'
sDelim: '<div style="chinese">'
eDelim: '</div>'
Output: '???'
Example 5. End delimiter not present
Text: '<text>Hello <span>Rosetta Code</span> world</text><table style="myTable">'
sDelim: '<text>'
eDelim: '<table>'
Output: 'Hello <span>Rosetta Code</span> world</text><table style="myTable">'
Example 6. Start delimiter not present
Text: '<table style="myTable"><tr><td>hello world</td></tr></table>'
sDelim: '<table>'
eDelim: '</table>'
Output: ''
Example 7. Multiple instances of end delimiter after start delimiter (match until the first one)
Text: 'The quick brown fox jumps over the lazy other fox'
sDelim: 'quick '
eDelim: ' fox'
Output: 'brown'
Example 8. Multiple instances of the start delimiter (start matching at the first one)
Text: 'One fish two fish red fish blue fish'
sDelim: 'fish '
eDelim: ' red'
Output: 'two fish'
Example 9. Start delimiter is end delimiter
Text: 'FooBarBazFooBuxQuux'
sDelim: 'Foo'
eDelim: 'Foo'
Output: 'BarBaz'
Example 10. Start and end delimiters use special values
Text: 'Hello Rosetta Code world'
sDelim: 'start'
eDelim: 'end'
Output: 'Hello Rosetta Code world'
Example 11. Null start delimiter
Text: 'Hello Rosetta Code world'
sDelim: ''
eDelim: 'x'
error: null delimiter
Example 12. Null end delimiter
Text: 'Hello Rosetta Code world'
sDelim: 'x'
eDelim: ''
error: null delimiter
C
#include <string.h>

/*
 * textBetween: Gets text between two delimiters.
 * The result is copied into returnText, which the caller must make large
 * enough to hold a copy of thisText. Returns a pointer to the position in
 * thisText just after the start delimiter, or NULL if it was not found.
 */
const char* textBetween(const char* thisText, const char* startText, const char* endText, char* returnText) {
    const char* startPointer = NULL;
    const char* endPointer = NULL;
    size_t stringLength = 0;

    returnText[0] = '\0';

    if (strcmp(startText, "start") == 0) {
        // Set the beginning of the string
        startPointer = thisText;
    } else {
        startPointer = strstr(thisText, startText);
        if (startPointer == NULL) {
            // Start delimiter not found: returnText stays empty
            return NULL;
        }
        startPointer = startPointer + strlen(startText);
    } // end if the start delimiter is "start"

    if (strcmp(endText, "end") != 0) {
        // Search only after the start delimiter
        endPointer = strstr(startPointer, endText);
    } // end if the end delimiter is not "end"

    // "end", or an end delimiter that is not found, matches the end of the string
    if (endPointer != NULL) {
        stringLength = (size_t)(endPointer - startPointer);
    } else {
        stringLength = strlen(startPointer);
    }

    // Copy characters between the start and end delimiters
    memcpy(returnText, startPointer, stringLength);
    returnText[stringLength] = '\0';

    return startPointer;
} // end textBetween method
C++
{{trans|C#}}
#include <iostream>

std::ostream& operator<<(std::ostream& out, const std::string& str) {
    return out << str.c_str();
}

std::string textBetween(const std::string& source, const std::string& beg, const std::string& end) {
    size_t startIndex;
    if (beg == "start") {
        startIndex = 0;
    } else {
        startIndex = source.find(beg);
        if (startIndex == std::string::npos) {
            return "";
        }
        startIndex += beg.length();
    }

    size_t endIndex = source.find(end, startIndex);
    if (endIndex == std::string::npos || end == "end") {
        return source.substr(startIndex);
    }

    return source.substr(startIndex, endIndex - startIndex);
}

void print(const std::string& source, const std::string& beg, const std::string& end) {
    using namespace std;
    cout << "text: '" << source << "'\n";
    cout << "start: '" << beg << "'\n";
    cout << "end: '" << end << "'\n";
    cout << "result: '" << textBetween(source, beg, end) << "'\n";
    cout << '\n';
}

int main() {
    print("Hello Rosetta Code world", "Hello ", " world");
    print("Hello Rosetta Code world", "start", " world");
    print("Hello Rosetta Code world", "Hello ", "end");
    print("<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">", "<text>", "<table>");
    print("<table style=\"myTable\"><tr><td>hello world</td></tr></table>", "<table>", "</table>");
    print("The quick brown fox jumps over the lazy other fox", "quick ", " fox");
    print("One fish two fish red fish blue fish", "fish ", " red");
    print("FooBarBazFooBuxQuux", "Foo", "Foo");

    return 0;
}
{{out}}
text: 'Hello Rosetta Code world'
start: 'Hello '
end: ' world'
result: 'Rosetta Code'
text: 'Hello Rosetta Code world'
start: 'start'
end: ' world'
result: 'Hello Rosetta Code'
text: 'Hello Rosetta Code world'
start: 'Hello '
end: 'end'
result: 'Rosetta Code world'
text: '<text>Hello <span>Rosetta Code</span> world</text><table style="myTable">'
start: '<text>'
end: '<table>'
result: 'Hello <span>Rosetta Code</span> world</text><table style="myTable">'
text: '<table style="myTable"><tr><td>hello world</td></tr></table>'
start: '<table>'
end: '</table>'
result: ''
text: 'The quick brown fox jumps over the lazy other fox'
start: 'quick '
end: ' fox'
result: 'brown'
text: 'One fish two fish red fish blue fish'
start: 'fish '
end: ' red'
result: 'two fish'
text: 'FooBarBazFooBuxQuux'
start: 'Foo'
end: 'Foo'
result: 'BarBaz'
C#
{{trans|D}}
using System;

namespace TextBetween {
    class Program {
        static string TextBetween(string source, string beg, string end) {
            int startIndex;

            if (beg == "start") {
                startIndex = 0;
            }
            else {
                startIndex = source.IndexOf(beg);
                if (startIndex < 0) {
                    return "";
                }
                startIndex += beg.Length;
            }

            int endIndex = source.IndexOf(end, startIndex);
            if (endIndex < 0 || end == "end") {
                return source.Substring(startIndex);
            }
            return source.Substring(startIndex, endIndex - startIndex);
        }

        static void Print(string s, string b, string e) {
            Console.WriteLine("text: '{0}'", s);
            Console.WriteLine("start: '{0}'", b);
            Console.WriteLine("end: '{0}'", e);
            Console.WriteLine("result: '{0}'", TextBetween(s, b, e));
            Console.WriteLine();
        }

        static void Main(string[] args) {
            Print("Hello Rosetta Code world", "Hello ", " world");
            Print("Hello Rosetta Code world", "start", " world");
            Print("Hello Rosetta Code world", "Hello ", "end");
            Print("</div><div style=\"chinese\">你好嗎</div>", "<div style=\"chinese\">", "</div>");
            Print("<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">", "<text>", "<table>");
            Print("<table style=\"myTable\"><tr><td>hello world</td></tr></table>", "<table>", "</table>");
            Print("The quick brown fox jumps over the lazy other fox", "quick ", " fox");
            Print("One fish two fish red fish blue fish", "fish ", " red");
            Print("FooBarBazFooBuxQuux", "Foo", "Foo");
        }
    }
}
D
import std.algorithm.searching;
import std.stdio;
import std.string;

string textBetween(string source, string beg, string end) in {
    assert(beg.length != 0, "beg cannot be empty");
    assert(end.length != 0, "end cannot be empty");
} body {
    ptrdiff_t si = source.indexOf(beg);
    if (beg == "start") {
        si = 0;
    } else if (si < 0) {
        return "";
    } else {
        si += beg.length;
    }

    auto ei = source.indexOf(end, si);
    if (ei < 0 || end == "end") {
        return source[si..$];
    }

    return source[si..ei];
}

void print(string s, string b, string e) {
    writeln("text: '", s, "'");
    writeln("start: '", b, "'");
    writeln("end: '", e, "'");
    writeln("result: '", s.textBetween(b, e), "'");
    writeln;
}

void main() {
    print("Hello Rosetta Code world", "Hello ", " world");
    print("Hello Rosetta Code world", "start", " world");
    print("Hello Rosetta Code world", "Hello ", "end");
    print("</div><div style=\"chinese\">你好嗎</div>", "<div style=\"chinese\">", "</div>");
    print("<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">", "<text>", "<table>");
    print("<table style=\"myTable\"><tr><td>hello world</td></tr></table>", "<table>", "</table>");
    print("The quick brown fox jumps over the lazy other fox", "quick ", " fox");
    print("One fish two fish red fish blue fish", "fish ", " red");
    print("FooBarBazFooBuxQuux", "Foo", "Foo");
}
{{out}}
text: 'Hello Rosetta Code world'
start: 'Hello '
end: ' world'
result: 'Rosetta Code'
text: 'Hello Rosetta Code world'
start: 'start'
end: ' world'
result: 'Hello Rosetta Code'
text: 'Hello Rosetta Code world'
start: 'Hello '
end: 'end'
result: 'Rosetta Code world'
text: '</div><div style="chinese">你好嗎</div>'
start: '<div style="chinese">'
end: '</div>'
result: '你好嗎'
text: '<text>Hello <span>Rosetta Code</span> world</text><table style="myTable">'
start: '<text>'
end: '<table>'
result: 'Hello <span>Rosetta Code</span> world</text><table style="myTable">'
text: '<table style="myTable"><tr><td>hello world</td></tr></table>'
start: '<table>'
end: '</table>'
result: ''
text: 'The quick brown fox jumps over the lazy other fox'
start: 'quick '
end: ' fox'
result: 'brown'
text: 'One fish two fish red fish blue fish'
start: 'fish '
end: ' red'
result: 'two fish'
text: 'FooBarBazFooBuxQuux'
start: 'Foo'
end: 'Foo'
result: 'BarBaz'
Factor
USING: combinators formatting kernel locals math
prettyprint.config sequences ;
IN: rosetta-code.text-between
:: start ( sdelim text -- n )
{
{ [ sdelim "start" = ] [ 0 ] }
{ [ sdelim text subseq-start ] [ sdelim text subseq-start sdelim length + ] }
[ text length ]
} cond ;
:: end ( edelim text i -- n )
{
{ [ edelim "end" = ] [ text length ] }
{ [ edelim text i subseq-start-from ] [ edelim text i subseq-start-from ] }
[ text length ]
} cond ;
:: text-between ( text sdelim edelim -- seq )
sdelim text start :> start-index
edelim text start-index end :> end-index
start-index end-index text subseq ;
: text-between-demo ( -- )
{
{ "Hello Rosetta Code world" "Hello " " world" }
{ "Hello Rosetta Code world" "start" " world" }
{ "Hello Rosetta Code world" "Hello " "end" }
{ "</div><div style=\"chinese\">你好嗎</div>" "<div style=\"chinese\">" "</div>" }
{ "<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">" "<text>" "<table>" }
{ "<table style=\"myTable\"><tr><td>hello world</td></tr></table>" "<table>" "</table>" }
{ "The quick brown fox jumps over the lazy other fox" "quick " " fox" }
{ "One fish two fish red fish blue fish" "fish " " red" }
{ "FooBarBazFooBuxQuux" "Foo" "Foo" }
}
[
first3 3dup text-between [
"Text: %u\nStart delimiter: %u\nEnd delimiter: %u\nOutput: %u\n\n"
printf
] without-limits ! prevent the prettyprinter from culling output
] each ;
MAIN: text-between-demo
{{out}}
Text: "Hello Rosetta Code world"
Start delimiter: "Hello "
End delimiter: " world"
Output: "Rosetta Code"
Text: "Hello Rosetta Code world"
Start delimiter: "start"
End delimiter: " world"
Output: "Hello Rosetta Code"
Text: "Hello Rosetta Code world"
Start delimiter: "Hello "
End delimiter: "end"
Output: "Rosetta Code world"
Text: "</div><div style=\"chinese\">你好嗎</div>"
Start delimiter: "<div style=\"chinese\">"
End delimiter: "</div>"
Output: "你好嗎"
Text: "<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">"
Start delimiter: "<text>"
End delimiter: "<table>"
Output: "Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">"
Text: "<table style=\"myTable\"><tr><td>hello world</td></tr></table>"
Start delimiter: "<table>"
End delimiter: "</table>"
Output: ""
Text: "The quick brown fox jumps over the lazy other fox"
Start delimiter: "quick "
End delimiter: " fox"
Output: "brown"
Text: "One fish two fish red fish blue fish"
Start delimiter: "fish "
End delimiter: " red"
Output: "two fish"
Text: "FooBarBazFooBuxQuux"
Start delimiter: "Foo"
End delimiter: "Foo"
Output: "BarBaz"
Go
{{trans|Kotlin}}
package main

import (
    "fmt"
    "strings"
)

func textBetween(str, start, end string) string {
    if str == "" || start == "" || end == "" {
        return str
    }
    s := 0
    if start != "start" {
        s = strings.Index(str, start)
    }
    if s == -1 {
        return ""
    }
    si := 0
    if start != "start" {
        si = s + len(start)
    }
    e := len(str)
    if end != "end" {
        e = strings.Index(str[si:], end)
        if e == -1 {
            return str[si:]
        }
        e += si
    }
    return str[si:e]
}

func main() {
    texts := [9]string{
        "Hello Rosetta Code world",
        "Hello Rosetta Code world",
        "Hello Rosetta Code world",
        "</div><div style=\"chinese\">你好嗎</div>",
        "<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">",
        "<table style=\"myTable\"><tr><td>hello world</td></tr></table>",
        "The quick brown fox jumps over the lazy other fox",
        "One fish two fish red fish blue fish",
        "FooBarBazFooBuxQuux",
    }
    starts := [9]string{
        "Hello ", "start", "Hello ", "<div style=\"chinese\">",
        "<text>", "<table>", "quick ", "fish ", "Foo",
    }
    ends := [9]string{
        " world", " world", "end", "</div>", "<table>",
        "</table>", " fox", " red", "Foo",
    }
    for i, text := range texts {
        fmt.Printf("Text: \"%s\"\n", text)
        fmt.Printf("Start delimiter: \"%s\"\n", starts[i])
        fmt.Printf("End delimiter: \"%s\"\n", ends[i])
        b := textBetween(text, starts[i], ends[i])
        fmt.Printf("Output: \"%s\"\n\n", b)
    }
}
{{out}}
Same as Kotlin entry.
Haskell
import Data.Text (Text, pack, unpack, breakOn, stripPrefix)
import Data.List (intercalate)
import Data.Maybe (fromMaybe)
import Control.Arrow ((***))

-- TEXT BETWEEN -----------------------------------------------------------
textBetween :: (Either String Text, Either String Text) -> Text -> Text
textBetween (start, end) txt =
  let retain sub part delim t =
        either (Just . const t) (sub $ part . flip breakOn t) delim
  in fromMaybe
       (pack [])
       (retain (stripPrefix <*>) snd start txt >>= retain (Just .) fst end)

-- TESTS ------------------------------------------------------------------
samples :: [Text]
samples =
  pack <$>
  [ "Hello Rosetta Code world"
  , "</div><div style=\"chinese\">你好吗</div>"
  , "<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">"
  , "<table style=\"myTable\"><tr><td>hello world</td></tr></table>"
  ]

delims :: [(Either String Text, Either String Text)]
delims =
  (wrap *** wrap) <$>
  [ ("Hello ", " world")
  , ("start", " world")
  , ("Hello", "end")
  , ("<div style=\"chinese\">", "</div>")
  , ("<text>", "<table>")
  , ("<text>", "</table>")
  ]

wrap :: String -> Either String Text
wrap x =
  if x `elem` ["start", "end"]
    then Left x
    else Right (pack x)

main :: IO ()
main = do
  mapM_ print $ flip textBetween (head samples) <$> take 3 delims
  (putStrLn . unlines) $
    zipWith
      (\d t -> intercalate (unpack $ textBetween d t) ["\"", "\""])
      (drop 3 delims)
      (tail samples)
{{Out}}
"Rosetta Code"
"Hello Rosetta Code"
" Rosetta Code world"
"你好吗"
"Hello <span>Rosetta Code</span> world</text><table style="myTable">"
""
J
'''Solution:'''
textBetween=: dyad define
text=. y
'start end'=. x
start=. ''"_^:('start'&-:) start
end=. text"_^:('end'&-:) end
end taketo start takeafter text
)
'''Example Usage:'''
('Hello ';' world') textBetween 'Hello Rosetta Code world'
Rosetta Code
'''Examples:'''
Test_text=: <;._2 noun define
Hello Rosetta Code world
Hello Rosetta Code world
Hello Rosetta Code world
</div><div style=\"chinese\">你好嗎</div>
<text>Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">
<table style=\"myTable\"><tr><td>hello world</td></tr></table>
The quick brown fox jumps over the lazy other fox
One fish two fish red fish blue fish
FooBarBazFooBuxQuux
)
Test_delim=: <"1 '|'&cut;._2 noun define
Hello | world
start| world
Hello |end
<div style=\"chinese\">|</div>
<text>|<table>
<table>|</table>
quick | fox
fish | red
Foo|Foo
)
Test_output=: <;._2 noun define
Rosetta Code
Hello Rosetta Code
Rosetta Code world
你好嗎
Hello <span>Rosetta Code</span> world</text><table style=\"myTable\">
brown
two fish
BarBaz
)
Test_output = Test_delim textBetween&.> Test_text
1 1 1 1 1 1 1 1 1
Java
javac textBetween.java
java -cp . textBetween "hello Rosetta Code world" "hello " " world"
public class textBetween {
    /*
     * textBetween: Get the text between two delimiters
     */
    static String textBetween(String thisText, String startString, String endString) {
        int startIndex = 0;
        int endIndex = 0;

        if (startString.equals("start")) {
            startIndex = 0;
        } else {
            startIndex = thisText.indexOf(startString);
            if (startIndex < 0) {
                // Start delimiter not present: return a blank string
                return "";
            }
            startIndex = startIndex + startString.length();
        }

        if (endString.equals("end")) {
            endIndex = thisText.length();
        } else {
            // Search for the end delimiter only after the start delimiter
            endIndex = thisText.indexOf(endString, startIndex);
            if (endIndex < 0) {
                // End delimiter not present: return the remainder of the text
                endIndex = thisText.length();
            }
        }

        return thisText.substring(startIndex, endIndex);
    } // end method textBetween

    /**
     * Main method
     */
    public static void main(String[] args) {
        String thisText = args[0];
        String startDelimiter = args[1];
        String endDelimiter = args[2];
        String returnText = textBetween(thisText, startDelimiter, endDelimiter);
        System.out.println(returnText);
    } // end method main
} // end class textBetween
JavaScript
ES5
function textBetween(thisText, startString, endString) {
    if (thisText == undefined) {
        return "";
    }
    var start_pos = 0;
    if (startString != 'start') {
        start_pos = thisText.indexOf(startString);

        // If the text does not contain the start string, return a blank string
        if (start_pos < 0) {
            return '';
        }

        // Skip the first startString characters
        start_pos = start_pos + startString.length;
    }

    var end_pos = thisText.length;
    if (endString != 'end') {
        end_pos = thisText.indexOf(endString, start_pos);
    }

    // If the text does not have the end string after the start string, return the whole string after the start
    if (end_pos < start_pos) {
        end_pos = thisText.length;
    }

    var newText = thisText.substring(start_pos, end_pos);

    return newText;
} // end textBetween
ES6
{{Trans|Haskell}} Composed from a set of generic functions
(() => {
    'use strict';

    // TEXT BETWEEN ----------------------------------------------------------

    // Delimiter pair -> Haystack -> Any enclosed text
    // textBetween :: (Either String String, Either String String) ->
    //                 String -> String
    const textBetween = ([start, end], txt) => {
        const
            retain = (post, part, delim, t) => either(
                d => just(const_(t, d)), // 'start' or 'end'. No clipping.
                d => post(part(flip(breakOnDef)(t, d))), // One side of break
                delim
            ),
            mbResidue = bindMay(
                retain( // Start token stripped from text after any break
                    curry(stripPrefix)(start.Right),
                    snd, start, txt
                ),
                // Left side of any break retained.
                curry(retain)(just, fst, end)
            );
        return mbResidue.nothing ? (
            ""
        ) : mbResidue.just;
    }

    // GENERIC FUNCTIONS -----------------------------------------------------

    // append (++) :: [a] -> [a] -> [a]
    const append = (xs, ys) => xs.concat(ys);

    // bindMay (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
    const bindMay = (mb, mf) =>
        mb.nothing ? mb : mf(mb.just);

    // Needle -> Haystack -> (prefix before match, match + rest)
    // breakOnDef :: String -> String -> (String, String)
    const breakOnDef = (pat, src) =>
        Boolean(pat) ? (() => {
            const xs = src.split(pat);
            return xs.length > 1 ? [
                xs[0], src.slice(xs[0].length)
            ] : [src, ''];
        })() : undefined;

    // const_ :: a -> b -> a
    const const_ = (k, _) => k;

    // Handles two or more arguments
    // curry :: ((a, b) -> c) -> a -> b -> c
    const curry = (f, ...args) => {
        const go = xs => xs.length >= f.length ? (f.apply(null, xs)) :
            function () {
                return go(xs.concat(Array.from(arguments)));
            };
        return go([].slice.call(args));
    };

    // drop :: Int -> [a] -> [a]
    // drop :: Int -> String -> String
    const drop = (n, xs) => xs.slice(n);

    // either :: (a -> c) -> (b -> c) -> Either a b -> c
    const either = (lf, rf, e) => {
        const ks = Object.keys(e);
        return elem('Left', ks) ? (
            lf(e.Left)
        ) : elem('Right', ks) ? (
            rf(e.Right)
        ) : undefined;
    };

    // elem :: Eq a => a -> [a] -> Bool
    const elem = (x, xs) => xs.includes(x);

    // flip :: (a -> b -> c) -> b -> a -> c
    const flip = f => (a, b) => f.apply(null, [b, a]);

    // fst :: (a, b) -> a
    const fst = pair => pair.length === 2 ? pair[0] : undefined;

    // just :: a -> Just a
    const just = x => ({
        nothing: false,
        just: x
    });

    // Left :: a -> Either a b
    const Left = x => ({
        Left: x
    });

    // map :: (a -> b) -> [a] -> [b]
    const map = (f, xs) => xs.map(f);

    // min :: Ord a => a -> a -> a
    const min = (a, b) => b < a ? b : a;

    // nothing :: () -> Nothing
    const nothing = (optionalMsg) => ({
        nothing: true,
        msg: optionalMsg
    });

    // Right :: b -> Either a b
    const Right = x => ({
        Right: x
    });

    // show :: Int -> a -> Indented String
    // show :: a -> String
    const show = (...x) =>
        JSON.stringify.apply(
            null, x.length > 1 ? [x[1], null, x[0]] : x
        );

    // snd :: (a, b) -> b
    const snd = tpl => Array.isArray(tpl) ? tpl[1] : undefined;

    // stripPrefix :: Eq a => [a] -> [a] -> Maybe [a]
    const stripPrefix = (pfx, s) => {
        const
            blnString = typeof pfx === 'string',
            [xs, ys] = blnString ? (
                [pfx.split(''), s.split('')]
            ) : [pfx, s];
        const
            sp_ = (xs, ys) => xs.length === 0 ? (
                just(blnString ? ys.join('') : ys)
            ) : (ys.length === 0 || xs[0] !== ys[0]) ? (
                nothing()
            ) : sp_(xs.slice(1), ys.slice(1));
        return sp_(xs, ys);
    };

    // tailDef :: [a] -> [a]
    const tailDef = xs => xs.length > 0 ? xs.slice(1) : [];

    // take :: Int -> [a] -> [a]
    const take = (n, xs) => xs.slice(0, n);

    // zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
    const zipWith = (f, xs, ys) =>
        Array.from({
            length: Math.min(xs.length, ys.length)
        }, (_, i) => f(xs[i], ys[i], i));

    // TESTS -----------------------------------------------------------------

    // samples :: [String]
    const samples = [
        'Hello Rosetta Code world',
        '</div><div style=\'chinese\'>你好吗</div>',
        '<text>Hello <span>Rosetta Code</span> world</text><table style=\'myTable\'>',
        '<table style=\'myTable\'><tr><td>hello world</td></tr></table>'
    ];

    // delims :: [(Either String String, Either String String)]
    const delims = map(
        curry(map)(x =>
            elem(x, ['start', 'end']) ? (
                Left(x) // Marker token
            ) : Right(x) // Literal text
        ), [
            ['Hello ', ' world'],
            ['start', ' world'],
            ['Hello', 'end'],
            ['<div style=\'chinese\'>', '</div>'],
            ['<text>', '<table>'],
            ['<text>', '</table>']
        ]);

    return show(2,
        append(
            map(
                fromTo => textBetween(fromTo, samples[0]),
                take(3, delims)
            ), zipWith(
                textBetween,
                drop(3, delims),
                tailDef(samples)
            )
        )
    );
})();
{{Out}}
[
"Rosetta Code",
"Hello Rosetta Code",
" Rosetta Code world",
"你好吗",
"Hello <span>Rosetta Code</span> world</text><table style='myTable'>",
""
]
jq
The implementation uses explode to ensure that arbitrary Unicode is handled properly.
def textbetween_strings($startdlm; $enddlm):
explode
| . as $in
| (if $startdlm == "start" then 0 else ($startdlm | length) end) as $len
| (if $startdlm == "start" then 0 else index($startdlm | explode) end) as $ix
| if $ix
then $in[$ix + $len:]
| if $enddlm == "end" then .
else index($enddlm | explode) as $ex
| if $ex then .[:$ex] else . end
end
else []
end
| implode;
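Why the jq entry works on code points rather than on the raw encoding can be illustrated with a short Python sketch. It is an assumption-laden illustration that uses the text of Example 4 and has no connection to jq's internals: Python's str is indexed by code point, while the UTF-8 encoded bytes are indexed by byte, so the two views disagree on lengths as soon as a non-ASCII character appears.

```python
text = '</div><div style="chinese">你好嗎</div>'
start = '<div style="chinese">'

# Offsets in a Python str count code points, comparable to exploding a string.
cp_pos = text.find(start) + len(start)

# The same search over the UTF-8 encoding yields byte offsets instead.
raw = text.encode("utf-8")
raw_start = start.encode("utf-8")
byte_pos = raw.find(raw_start) + len(raw_start)

print(cp_pos, byte_pos)    # 27 27: equal here, since all preceding text is ASCII
print(len(text), len(raw)) # 36 42: each of the three CJK characters takes 3 bytes
print(text[cp_pos:cp_pos + 3])                     # 你好嗎 (three code points)
print(raw[byte_pos:byte_pos + 3].decode("utf-8"))  # 你 (three bytes, one character)
```

Mixing a byte offset with code-point slicing, or the reverse, would cut the result in the wrong place whenever non-ASCII text precedes the delimiter.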