[Regex] matching issues


s/(?=(.))(?<!\1)\1(?!\1)/$1$1/g

theoretically, the above regex (in an implementation that supports lookbehind) doubles every character that has no identical neighbour, giving these results (the bool column says whether an x gets matched):

str bool  result
  x true  xx
 xx false xx
 xy true  xxyy
 yx true  yyxx
yxx false yyxx
xxy false xxyy
yxy true  yyxxyy
xxx false xxx

but it doesn’t appear to match at all, or is not compiling to a regex. any ideas?
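
For comparison, here is a rough sketch of the behaviour the table describes, written as a plain loop with no regex at all. The class and method names are made up for illustration, and it compares UTF-16 chars directly, so it ignores surrogate pairs and, unlike the regex `.`, also treats line terminators as ordinary characters.

```java
public class IsolatedDoubler {
    // Doubles every character whose neighbours on both sides differ from it.
    static String doubleIsolated(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Equivalent of the lookbehind: is the previous character the same?
            boolean sameBefore = i > 0 && s.charAt(i - 1) == c;
            // Equivalent of the trailing lookahead: is the next character the same?
            boolean sameAfter = i + 1 < s.length() && s.charAt(i + 1) == c;
            out.append(c);
            if (!sameBefore && !sameAfter) {
                out.append(c); // isolated character, emit it a second time
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] inputs = {"x", "xx", "xy", "yx", "yxx", "xxy", "yxy", "xxx"};
        for (String s : inputs) {
            System.out.println(s + " -> " + doubleIsolated(s));
        }
    }
}
```

Something like this is handy as a reference to check a regex version against, one row of the table at a time.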

 
(?=(.))(?<!\1)

I think these parts are conflicting to start.

If you haven’t checked this out, it might be worth your time.

 

Originally posted by UnknownGuardian:

(?=(.))(?<!\1)

I think these parts are conflicting to start.

both are zero-width assertions: when they finish, the match position is still before the first character. i did find out that java doesn’t like it, though:

java.util.regex.PatternSyntaxException: Look-behind group does not have an obvious maximum length near index 12
(?=(.))(?<!\1)\1(?!\1)

despite the fact that the maximum length is “1” and the minimum length is “1”

 

How about something like this: /(?=(.))(?<!(?=\1).)\1(?!\1)/g

Lookbehinds may not be able to be variable length (even though, contextually, it’s not), but lookaheads can be. AS3 (which I was using to test) doesn’t support that substitution syntax, but you get the idea.
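
For anyone on Java (the engine that rejected the original pattern), a minimal sketch of that workaround might look like the following. The class and method names are hypothetical; only `Pattern.compile` and `Matcher.replaceAll` are real `java.util.regex` calls. The idea is that the lookbehind body is a single `.`, so its length is plainly 1, and the backreference check is moved into a lookahead nested inside it.

```java
import java.util.regex.Pattern;

public class LookbehindWorkaround {
    // (?=(.))        capture the next character without consuming it
    // (?<!(?=\1).)   step back one character; reject if it equals the capture
    // \1             consume the captured character
    // (?!\1)         reject if the following character equals it as well
    static final Pattern ISOLATED =
            Pattern.compile("(?=(.))(?<!(?=\\1).)\\1(?!\\1)");

    // Replaces each isolated character with two copies of itself.
    static String doubleIsolated(String s) {
        return ISOLATED.matcher(s).replaceAll("$1$1");
    }

    public static void main(String[] args) {
        String[] inputs = {"x", "xx", "xy", "yx", "yxx", "xxy", "yxy", "xxx"};
        for (String s : inputs) {
            System.out.println(s + " -> " + doubleIsolated(s));
        }
    }
}
```

In Java the replacement string uses `$1` for the first group, which happens to match the s/// syntax from the first post.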

 

Originally posted by BigJM:

How about something like this: /(?=(.))(?<!(?=\1).)\1(?!\1)/g

Lookbehinds may not be able to be variable length (even though, contextually, it’s not), but lookaheads can be. AS3 (which I was using to test) doesn’t support that substitution syntax, but you get the idea.

…it works. what the hell. that makes the error even more pointless. i hate java.

unrelatedly, i hate lua. in fact, i hate it so much i had to tell you i hated it in a topic that has nothing to do with lua.

 

Haha why do you hate lua?

That is a stupid error though. Given that each backreference is associated with a particular capturing group, it seems pretty trivial to make the parser actually look at that group and determine if it’s fixed or variable length. Furthermore, I think some implementations allow you to have lookbehinds of variable length. But at least be happy that you get an error; AS3 just sits there and silently fails.

 

Originally posted by BigJM:

Haha why do you hate lua?

That is a stupid error though. Given that each backreference is associated with a particular capturing group, it seems pretty trivial to make the parser actually look at that group and determine if it’s fixed or variable length. Furthermore, I think some implementations allow you to have lookbehinds of variable length. But at least be happy that you get an error; AS3 just sits there and silently fails.

the /.../ version should give you a compiler error, and throwing it into the RegExp constructor should throw an error.

i hate lua because it doesn’t support UTF8 strings. raw bytes in a string are written as "\192" (yes, decimal) instead of in octal, or with \x## or \u####. accessing a string gives you only the raw bytes, and you must manually (without bitwise operators!) decode or encode UTF8 after accepting/before outputting the data.

it uses ~= for !=, and the keyword not for simply inverting a boolean variable. there are no combination operators such as +=, so every operation like that is 30% longer. an if block is written as: if cond then end, which consumes substantially more space than if(){} and is harder to pick out because it’s all words. if 0 then print("Hello!") end prints “Hello!”, so non-booleans require explicit comparisons, unless you have nil; nil is false.

it completely lacks switch statements, which seems like a minor thing until you need to build another structure and end up using either function calls or an if/else tree/ladder (O(log n) or O(n) performance, respectively). it also completely lacks try/catch.

and dozens of other small things add up to make it a huge pain in the ass for any experienced programmer to actually use (i’d prefer perl or cobol. hell, i’d rather write hex x86 assembly), and make it even more difficult for anyone who learns it as a first language to move to any other programming language. they’ve tried so damned hard to make the language “simple” that they’ve omitted basic functionality and made it incredibly difficult to read or write for experienced programmers.

 

Yeah, Lua gets messy pretty fast.
e.g., a simple “kick” command for a chat I made a while ago:

--[[kick]]gamooga.onmessage("kick",function(id,u)
	if admins[id] then
		if users[u] then
			bid=users[u]
			if admins[bid] then return
			else
				ids[bid]=nil
				gamooga.send(bid,"b")
				gamooga.disconnect(bid)
				gamooga.broadcast("bye",{n=u})
				if muted[bid] then muted[bid]=nil end
				users[u]=nil
			end
		end
	end
end
)

And that’s something that you should be able to do in a couple of lines in any sane language.
Now think about any semi-complex task and you’ll see how it would result in a hell of if/then/else/end groups.

 

as condensed as i can make it:

--[[kick]]gamooga.onmessage("kick",function(id,u)
	if admins[id] and users[u] then
		id=users[u]
		if not admins[id] then
			ids[id]=nil
			gamooga.send(id,"b")
			gamooga.disconnect(id)
			gamooga.broadcast("bye",{n=u})
			muted[id]=nil
			users[u]=nil
		end
	end
end)

and an AS3 reference:

/*kick*/gamooga.onmessage("kick", function(id:int, u:String) {
	if (admins[id] && users[u] && !admins[id=users[u]])
		gamooga.send(id,"b"),
		gamooga.disconnect(id),
		gamooga.broadcast("bye", {n:u}),
		ids[id] = muted[id] = users[u] = null;
})
 

Ah, so they have “and” and “not” operators.
A bit late to update that thing, but good to know. Although I don’t think I’ll be using Lua again anytime soon >_>