The task is to strip control codes and extended characters from a string. The solution should demonstrate how to achieve each of the following results:

• a string with control codes stripped (but extended characters not stripped)
• a string with control codes and extended characters stripped

In ASCII, the control codes have decimal codes 0 through to 31 and 127. On an ASCII based system, if the control codes are stripped, the resultant string would have all of its characters within the range of 32 to 126 decimal on the ASCII table.

On a non-ASCII based system, we consider any character that does not have a corresponding glyph on the ASCII table (within the ASCII range of 32 to 126 decimal) to be an extended character for the purpose of this task.
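As a point of reference before the per-language solutions, the two required results can be sketched in Python. This is a minimal sketch, not one of the page's solutions: the function names `strip_control` and `strip_control_and_extended` are hypothetical, and it assumes each character's code point is compared directly against the ASCII ranges given above. The sample string mirrors the one in the first solution below (codes 11 and 127 are control codes; 166, 203 and 202 are extended).

```python
def strip_control(text: str) -> str:
    """Drop ASCII control codes (0 to 31 and 127); keep everything else."""
    return "".join(ch for ch in text if ord(ch) > 31 and ord(ch) != 127)


def strip_control_and_extended(text: str) -> str:
    """Keep only characters in the printable ASCII range 32 to 126."""
    return "".join(ch for ch in text if 32 <= ord(ch) <= 126)


if __name__ == "__main__":
    # 5 ordinary characters, 2 control codes (11, 127), 3 extended (166, 203, 202)
    sample = "a\x0bb\xa6c\x7f\xcb\xcade"
    for label, value in (
        ("original", sample),
        ("no control codes", strip_control(sample)),
        ("no control or extended", strip_control_and_extended(sample)),
    ):
        # ascii() escapes non-printable characters so the result is safe to print
        print(f"{label}: {ascii(value)} (length {len(value)})")
```

The second function is simply the first with an upper bound added, which is why many of the solutions below share one routine and vary only the range test.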

## Ada

```
with Ada.Text_IO;

procedure Strip_ASCII is

Full: String := 'a' & Character'Val(11) & 'b' & Character'Val(166) &
'c' & Character'Val(127) & Character'Val(203) &
Character'Val(202) & "de";
-- 5 ordinary characters ('a' .. 'e')
-- 2 control characters (11, 127); note that 11 is the "vertical tab"
-- 3 extended characters (166, 203, 202)

function Filter(S:     String;
From:  Character := ' ';
To:    Character := Character'Val(126);
Above: Character := Character'Val(127)) return String is
begin
if S'Length = 0 then
return "";
elsif (S(S'First) >= From and then S(S'First) <= To) or else S(S'First) > Above then
return S(S'First) & Filter(S(S'First+1 .. S'Last), From, To, Above);
else
return Filter(S(S'First+1 .. S'Last), From, To, Above);
end if;
end Filter;

procedure Put_Line(Text, S: String) is
begin
Ada.Text_IO.Put_Line(Text & " """ & S & """, Length:" & Integer'Image(S'Length));
end Put_Line;

begin
Put_Line("The full string :", Full);
Put_Line("No Control Chars:", Filter(Full)); -- default values for From, To, and Above
Put_Line("Neither_Extended:", Filter(Full, Above => Character'Last)); -- defaults for From and To
end Strip_ASCII;

```

Output:

```
The full string : "a
b�c��de", Length: 10
No Control Chars: "ab�c��de", Length: 8
Neither_Extended: "abcde", Length: 5
```

## ALGOL 68

```
# remove control characters and optionally extended characters from the string text  #
# assumes ASCII is the character set                                                 #
PROC strip characters = ( STRING text, BOOL strip extended )STRING:
BEGIN
# we build the result in a []CHAR and convert back to a string at the end #
INT text start = LWB text;
INT text max   = UPB text;
[ text start : text max ]CHAR result;
INT result pos := text start;
FOR text pos FROM text start TO text max DO
INT ch := ABS text[ text pos ];
IF ( ch >= 0 AND ch <= 31 ) OR ch = 127 THEN
# control character #
SKIP
ELIF strip extended AND ( ch > 126 OR ch < 0 ) THEN
# extended character and we don't want them #
SKIP
ELSE
# include this character #
result[ result pos ] := REPR ch;
result pos +:= 1
FI
OD;
result[ text start : result pos - 1 ]
END # strip characters # ;

# test the control/extended character stripping procedure #
STRING t = REPR 2 + "abc" + REPR 10 + REPR 160 + "def~" + REPR 127 + REPR 10 + REPR 150 + REPR 152 + "!";
print( ( "<<" + t + ">> - without control characters:             <<" + strip characters( t, FALSE ) + ">>", newline ) );
print( ( "<<" + t + ">> - without control or extended characters: <<" + strip characters( t, TRUE  ) + ">>", newline ) )
```

Output:

```
<<�abc
ádef~
ûÿ!>> - without control characters:             <<abcádef~ûÿ!>>
<<�abc
ádef~
ûÿ!>> - without control or extended characters: <<abcdef~!>>

```

## AutoHotkey

Translation of: Python

```
Stripped(x){
Loop Parse, x
if Asc(A_LoopField) > 31 and Asc(A_LoopField) < 127 ; 127 (DEL) is a control code
r .= A_LoopField
return r
}
MsgBox % stripped("`ba" Chr(00) "b`n`rc`fd" Chr(0xc3))
```

## AWK

```
# syntax: GAWK -f STRIP_CONTROL_CODES_AND_EXTENDED_CHARACTERS.AWK
BEGIN {
s = "ab\xA2\x09z" # a b cent tab z
printf("original string: %s (length %d)\n",s,length(s))
gsub(/[\x00-\x1F\x7F]/,"",s); printf("control characters stripped: %s (length %d)\n",s,length(s))
gsub(/[\x80-\xFF]/,"",s); printf("control and extended stripped: %s (length %d)\n",s,length(s))
exit(0)
}

```

output:

```
original string: ab¢    z (length 5)
control characters stripped: ab¢z (length 4)
control and extended stripped: abz (length 3)

```

## BASIC

Works with: QBasic

While DOS does support *some* extended characters, they aren't entirely standardized, and shouldn't be relied upon.

```
DECLARE FUNCTION strip$ (what AS STRING)
DECLARE FUNCTION strip2$ (what AS STRING)

DIM x AS STRING, y AS STRING, z AS STRING

'   tab                c+cedilla           eof
x = CHR$(9) + "Fran" + CHR$(135) + "ais" + CHR$(26)
y = strip$(x)
z = strip2$(x)

PRINT "x:"; x
PRINT "y:"; y
PRINT "z:"; z

' strip$ removes control codes and extended characters
FUNCTION strip$ (what AS STRING)
DIM outP AS STRING, L0 AS INTEGER, tmp AS STRING
FOR L0 = 1 TO LEN(what)
tmp = MID$(what, L0, 1)
SELECT CASE ASC(tmp)
CASE 32 TO 126
outP = outP + tmp
END SELECT
NEXT
strip$ = outP
END FUNCTION

' strip2$ removes control codes but keeps selected extended characters
FUNCTION strip2$ (what AS STRING)
DIM outP AS STRING, L1 AS INTEGER, tmp AS STRING
FOR L1 = 1 TO LEN(what)
tmp = MID$(what, L1, 1)
SELECT CASE ASC(tmp)
'normal     accented    various     greek, math, etc.
CASE 32 TO 126, 128 TO 168, 171 TO 175, 224 TO 253
outP = outP + tmp
END SELECT
NEXT
strip2$ = outP
END FUNCTION
```

Output: x: Français→ y:Franais z:Français
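The caveat above about DOS extended characters can be illustrated with a short Python sketch. It decodes the same byte (135, the one the QBasic example labels "c+cedilla") under three DOS code pages and prints the Unicode name of the result. The helper name `glyph_for` is hypothetical, and the choice of `cp437`, `cp850` and `cp866` is an assumption made for illustration; the original note does not name any specific code page.

```python
import unicodedata


def glyph_for(byte_value: int, code_page: str) -> str:
    """Decode a single byte under the given code page."""
    return bytes([byte_value]).decode(code_page)


if __name__ == "__main__":
    # The same byte maps to different characters depending on the code page
    for code_page in ("cp437", "cp850", "cp866"):
        glyph = glyph_for(135, code_page)
        print(f"{code_page}: U+{ord(glyph):04X} {unicodedata.name(glyph)}")
```

Under the two Western code pages the byte decodes to a c with cedilla, while under the Cyrillic one it decodes to a different letter, which is why a fixed list of "safe" extended codes such as the one in `strip2$` only holds for one code page.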

## BBC BASIC

```
test$ = CHR$(9) + "Fran" + CHR$(231) + "ais." + CHR$(127)
PRINT "Original ISO-8859-1 string: " test$ " (length " ; LEN(test$) ")"
test$ = FNstripcontrol(test$)
PRINT "Control characters stripped: " test$ " (length " ; LEN(test$) ")"
test$ = FNstripextended(test$)
PRINT "Control & extended stripped: " test$ " (length " ; LEN(test$) ")"
END

DEF FNstripcontrol(A$) : REM CHR$(127) is a 'control' code
LOCAL I%
WHILE I%<LEN(A$)
I% += 1
IF ASCMID$(A$,I%)<32 OR ASCMID$(A$,I%)=127 THEN
A$ = LEFT$(A$,I%-1) + MID$(A$,I%+1)
I% -= 1 : REM re-check this position, the next character has moved into it
ENDIF
ENDWHILE
= A$

DEF FNstripextended(A$)
LOCAL I%
WHILE I%<LEN(A$)
I% += 1
IF ASCMID$(A$,I%)>127 THEN
A$ = LEFT$(A$,I%-1) + MID$(A$,I%+1)
I% -= 1 : REM re-check this position, the next character has moved into it
ENDIF
ENDWHILE
= A$
```

Output:

```
Original ISO-8859-1 string:  Français (length 11)
Control characters stripped: Français. (length 9)
Control & extended stripped: Franais. (length 8)

```

## Bracmat

```
(  "string of ☺☻♥♦⌂, may include control
characters and other ilk.\L\D§►↔◄
Rødgrød med fløde"
: ?string1
: ?string2
& :?newString
&   whl
' ( @(!string1:?clean (%@:<" ") ?string1)
& !newString !clean:?newString
)
& !newString !string1:?newString
& out$(str$("Control characters stripped:
" str$!newString))
& :?newString
&   whl
' ( @(!string2:?clean (%@:(<" "|>"~")) ?string2)
& !newString !clean:?newString
)
& !newString !string2:?newString
&   out
$ ( str
$ ( "
Control characters and extended characters stripped:
"
str$!newString
)
)
& );
```

Output:

```
Control characters stripped:
string of ⌂, may include controlcharacters and other ilk.§Rødgrød med fløde

Control characters and extended characters stripped:
string of , may include controlcharacters and other ilk.Rdgrd med flde
```

## C

```
#include <stdio.h>
#include <stdlib.h>

#define IS_CTRL  (1 << 0)
#define IS_EXT	 (1 << 1)
#define IS_ALPHA (1 << 2)
#define IS_DIGIT (1 << 3) /* not used, just give you an idea */

unsigned int char_tbl[256] = {0};

/* could use <ctype.h>, but its functions do pretty much the same thing */
void init_table()
{
int i;

for (i = 0; i < 32; i++) char_tbl[i] |= IS_CTRL;
char_tbl[127] |= IS_CTRL;

for (i = 'A'; i <= 'Z'; i++) {
char_tbl[i] |= IS_ALPHA;
char_tbl[i + 0x20] |= IS_ALPHA; /* lower case */
}

for (i = 128; i < 256; i++) char_tbl[i] |= IS_EXT;
}

/* depends on what "stripped" means; we do it in place.
* "what" is a combination of the IS_* macros, meaning strip if
* a char IS_ any of them
*/
void strip(char * str, int what)
{
unsigned char *ptr, *s = (void*)str;
ptr = s;
while (*s != '\0') {
if ((char_tbl[(int)*s] & what) == 0)
*(ptr++) = *s;
s++;
}
*ptr = '\0';
}

int main()
{
char a[256];
int i;

init_table();

/* populate string with one of each char (1..255), then terminate it */
for (i = 1; i < 256; i++) a[i - 1] = i;
a[255] = '\0';
strip(a, IS_CTRL);
printf("%s\n", a);

for (i = 1; i < 256; i++) a[i - 1] = i;
a[255] = '\0';
strip(a, IS_CTRL | IS_EXT);
printf("%s\n", a);

for (i = 1; i < 256; i++) a[i - 1] = i;
a[255] = '\0';
strip(a, IS_CTRL | IS_EXT | IS_ALPHA);
printf("%s\n", a);

return 0;
}
```

output:

```
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ <odd stuff my xterm thinks are bad unicode hence can't be properly shown>
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
 !"#$%&'()*+,-./0123456789:;<=>?@[\]^_`{|}~
```

## C++

```cpp
#include <string>
#include <iostream>
#include <algorithm>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/casts.hpp>
#include <ctime>
#include <cstdlib>
using namespace boost::lambda ;

struct MyRandomizer {
char operator( )( ) {
return static_cast<char>( rand( ) % 256 ) ;
}
} ;

std::string deleteControls ( std::string startstring ) {
std::string noControls( "                                        " ) ;//creating space for
//the standard algorithm remove_copy_if
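std::remove_copy_if( startstring.begin( ) , startstring.end( ) , noControls.begin( ) ,
// compare as unsigned char so extended characters (128..255) are not seen as negative
ll_static_cast<unsigned char>( _1 ) < 32 || ll_static_cast<unsigned char>( _1 ) == 127 ) ;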
return noControls ;
}

std::string deleteExtended( std::string startstring ) {
std::string noExtended ( "                                        " ) ;//same as above
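std::remove_copy_if( startstring.begin( ) , startstring.end( ) , noExtended.begin( ) ,
// keep only 32..126: drop extended characters, DEL (127) and control codes below 32
ll_static_cast<unsigned char>( _1 ) > 126 || ll_static_cast<unsigned char>( _1 ) < 32 ) ;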
return noExtended ;
}

int main( ) {
std::string my_extended_string ;
for ( int i = 0 ; i < 40 ; i++ ) //we want the extended string to be 40 characters long
my_extended_string.append( " " ) ;
srand( time( 0 ) ) ;
std::generate_n( my_extended_string.begin( ) , 40 , MyRandomizer( ) ) ;
std::string no_controls( deleteControls( my_extended_string ) ) ;
std::string no_extended ( deleteExtended( my_extended_string ) ) ;
std::cout << "string with all characters: " << my_extended_string << std::endl ;
std::cout << "string without control characters: " << no_controls << std::endl ;
std::cout << "string without extended characters: " << no_extended << std::endl ;
return 0 ;
}
```

Output:

```
string with all characters: K�O:~���7�5����
���W��@>��ȓ�q�Q@���W-
string without control characters: K�O:~���7�5����
���W��@>��ȓ�q�Q@���W-
string without extended characters: KO:~75W@>qQ@W-
```
```before sanitation : �L08&YH�O��n)�:���O�G\$���.���"zO���Q�?��