原文:
Hi there,
I just discovered Go and decided to port a little program to Go.
The program reads JSON-Data from an URL and process the Data. The Go
port works well till now.
I dont have any influence on the JSON data and so sometimes there are
control character in it and my program crashes with "invalid character
'\x12' in string literal"
here the code sample of my program:
http_return, err := http.Get(newurl)
var http_body []byte;
if err == nil {
http_body, err = ioutil.ReadAll(http_return.Body)
http_return.Body.Close()
}
param_info := make(map[string]interface{})
param_err := json.Unmarshal(http_body, ¶m_info)
This Unmarshal call crashes with "invalid character '\x12' in string
literal"
How do I remove this or any similar control character from the JSON
data. I dont need those characters so they can simply be removed.
Thanks
Nico
==============================================================
意思大概就是说,json是通过网络传输或者其它方法获取的,然后里面可能包含特殊字符,这样呢,在go里面用json 解析就会报错。
于是有个哥们就出了一招。
========================
If you don't mind them being replaced with spaces, you could do something like:
for i, ch := range http_body {
switch {
case ch > '~': http_body[i] = ' '
case ch == '\r':
case ch == '\n':
case ch == '\t':
case ch < ' ': http_body[i] = ' '
}
}
~K
========================================
具体什么原理呢?还不清楚。
大概是根据字符集来判断哪些是json里面的非法字符,然后替换成空格。
计算机,字符集。。。。。后面需要研究一下了。
===================================
看一下ASCII可显示字符表,就明白了。 空格开始,~结束。加上\r\n\t这3个字符。
ASCII可显示字符
二进制 | 十进制 | 十六进制 | 图形 |
0010 0000 |
32 |
20 |
(空格)(␠) |
0010 0001 |
33 |
21 |
! |
0010 0010 |
34 |
22 |
" |
0010 0011 |
35 |
23 |
# |
0010 0100 |
36 |
24 |
$ |
0010 0101 |
37 |
25 |
% |
0010 0110 |
38 |
26 |
& |
0010 0111 |
39 |
27 |
' |
0010 1000 |
40 |
28 |
( |
0010 1001 |
41 |
29 |
) |
0010 1010 |
42 |
2A |
* |
0010 1011 |
43 |
2B |
+ |
0010 1100 |
44 |
2C |
, |
0010 1101 |
45 |
2D |
- |
0010 1110 |
46 |
2E |
. |
0010 1111 |
47 |
2F |
/ |
0011 0000 |
48 |
30 |
0 |
0011 0001 |
49 |
31 |
1 |
0011 0010 |
50 |
32 |
2 |
0011 0011 |
51 |
33 |
3 |
0011 0100 |
52 |
34 |
4 |
0011 0101 |
53 |
35 |
5 |
0011 0110 |
54 |
36 |
6 |
0011 0111 |
55 |
37 |
7 |
0011 1000 |
56 |
38 |
8 |
0011 1001 |
57 |
39 |
9 |
0011 1010 |
58 |
3A |
: |
0011 1011 |
59 |
3B |
; |
0011 1100 |
60 |
3C |
< |
0011 1101 |
61 |
3D |
= |
0011 1110 |
62 |
3E |
> |
0011 1111 |
63 |
3F |
? |
|
|
二进制 | 十进制 | 十六进制 | 图形 |
0100 0000 |
64 |
40 |
@ |
0100 0001 |
65 |
41 |
A |
0100 0010 |
66 |
42 |
B |
0100 0011 |
67 |
43 |
C |
0100 0100 |
68 |
44 |
D |
0100 0101 |
69 |
45 |
E |
0100 0110 |
70 |
46 |
F |
0100 0111 |
71 |
47 |
G |
0100 1000 |
72 |
48 |
H |
0100 1001 |
73 |
49 |
I |
0100 1010 |
74 |
4A |
J |
0100 1011 |
75 |
4B |
K |
0100 1100 |
76 |
4C |
L |
0100 1101 |
77 |
4D |
M |
0100 1110 |
78 |
4E |
N |
0100 1111 |
79 |
4F |
O |
0101 0000 |
80 |
50 |
P |
0101 0001 |
81 |
51 |
Q |
0101 0010 |
82 |
52 |
R |
0101 0011 |
83 |
53 |
S |
0101 0100 |
84 |
54 |
T |
0101 0101 |
85 |
55 |
U |
0101 0110 |
86 |
56 |
V |
0101 0111 |
87 |
57 |
W |
0101 1000 |
88 |
58 |
X |
0101 1001 |
89 |
59 |
Y |
0101 1010 |
90 |
5A |
Z |
0101 1011 |
91 |
5B |
[ |
0101 1100 |
92 |
5C |
\ |
0101 1101 |
93 |
5D |
] |
0101 1110 |
94 |
5E |
^ |
0101 1111 |
95 |
5F |
_ |
|
|
二进制 | 十进制 | 十六进制 | 图形 |
0110 0000 |
96 |
60 |
` |
0110 0001 |
97 |
61 |
a |
0110 0010 |
98 |
62 |
b |
0110 0011 |
99 |
63 |
c |
0110 0100 |
100 |
64 |
d |
0110 0101 |
101 |
65 |
e |
0110 0110 |
102 |
66 |
f |
0110 0111 |
103 |
67 |
g |
0110 1000 |
104 |
68 |
h |
0110 1001 |
105 |
69 |
i |
0110 1010 |
106 |
6A |
j |
0110 1011 |
107 |
6B |
k |
0110 1100 |
108 |
6C |
l |
0110 1101 |
109 |
6D |
m |
0110 1110 |
110 |
6E |
n |
0110 1111 |
111 |
6F |
o |
0111 0000 |
112 |
70 |
p |
0111 0001 |
113 |
71 |
q |
0111 0010 |
114 |
72 |
r |
0111 0011 |
115 |
73 |
s |
0111 0100 |
116 |
74 |
t |
0111 0101 |
117 |
75 |
u |
0111 0110 |
118 |
76 |
v |
0111 0111 |
119 |
77 |
w |
0111 1000 |
120 |
78 |
x |
0111 1001 |
121 |
79 |
y |
0111 1010 |
122 |
7A |
z |
0111 1011 |
123 |
7B |
{ |
0111 1100 |
124 |
7C |
| |
0111 1101 |
125 |
7D |
} |
0111 1110 |
126 |
7E |
~ |
|