I thought this was going to be pretty simple, but I've been struggling with it for a while now. I know there are CSS parser classes out there that can achieve what I want, but I don't need 95% of their functionality, so they'd be too heavy for this.

All I need is to pull out any class and/or ID names used in a CSS file via regex. Here's the regex I thought would work, but it doesn't:

[^a-z0-9][\w]*(?=\s)

When run against my sample:

.stuffclass {
 color:#fff;
 background:url('blah.jpg');
}
.newclass{
 color:#fff;
 background:url('blah.jpg');
}
.oldclass {
 color:#fff;
 background:url('blah.jpg');
}
#blah.newclass {
 color:#fff;
 background:url('blah.jpg');
}
.oldclass#blah{
 color:#fff;
 background:url('blah.jpg');
}
.oldclass #blah {
 color:#fff;
 background:url('blah.jpg');
}
.oldclass .newclass {
 text-shadow:1px 1px 0 #fff;
 color:#fff;
 background:url('blah.jpg');
}
.oldclass:hover{
 color:#fff;
 background:url('blah.jpg');
}
.newclass:active {
 text-shadow:1px 1px 0 #000;
}

It does match most of what I want, but it also matches the curly brackets and doesn't match the IDs. I need to match IDs and classes separately when they're conjoined, so #blah.newclass would be 2 separate matches: #blah and .newclass.

Any ideas?

===================

FINAL SOLUTION

I wound up using 2 regexes: the first strips out everything between { and }, and the second matches the selectors in what remains.

Here's a full working example:

//Grab contents of css file
$file = file_get_contents('css/style.css');

//Strip out everything between { and }
$pattern_one = '/(?<=\{)(.*?)(?=\})/s';

//Match any and all selectors (and pseudos)
//Note: inside a character class "|" is a literal pipe, not "or", so it is left out here
$pattern_two = '/[.#][\w]([:\w]+?)+/';

//Run the first regex pattern on the input
$stripped = preg_replace($pattern_one, '', $file);

//Variable to hold results
$selectors = array();

//Run the second regex pattern on $stripped input
//(preg_match_all returns the number of matches; the matches themselves go into $selectors)
$matches = preg_match_all($pattern_two, $stripped, $selectors);

//Show the results
print_r(array_unique($selectors[0]));
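
The two steps above can also be sketched as a small reusable function. This is only a sketch under a few assumptions: extract_selectors is a hypothetical name, hyphens are allowed in names (the pattern above only allows \w and ":"), and pseudo-classes are dropped rather than kept. Nested blocks such as @media rules are not handled, because the first pattern stops at the first closing brace.

```php
<?php
// Hypothetical helper (not from the original post): returns the unique
// class and ID selectors found in a CSS string.
function extract_selectors(string $css): array
{
    // Step 1: empty every declaration block so that property values
    // (colors like #fff, file names like blah.jpg) cannot match.
    $stripped = preg_replace('/(?<=\{)(.*?)(?=\})/s', '', $css);

    // Step 2: match a dot or hash followed by a name. Allowing hyphens
    // is an assumption added for this sketch.
    preg_match_all('/[.#][\w-]+/', $stripped, $found);

    // Drop duplicates and reindex the result from 0.
    return array_values(array_unique($found[0]));
}

$css = "#blah.newclass {\n color:#fff;\n background:url('blah.jpg');\n}\n"
     . ".oldclass:hover{\n color:#fff;\n}\n";

print_r(extract_selectors($css));
```

For this two-rule sample the result should contain #blah, .newclass and .oldclass as separate entries.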

3 Comments
  • Why not use a complete CSS parser to extract the selectors? Commented May 18, 2012 at 14:41
  • What's wrong with a CSS parser? Have you run any benchmarks? Don't rule out just because you think it'd be "too heavy". Commented May 18, 2012 at 14:43
  • lol doh! I misspelled my own name... pfft. and I don't use a complete CSS parser because, as mentioned above, they're just far too heavy and bloated for what I want to do... They include a TON of functionality that I would never use. A simple one line regex would be ideal for this if I could just get it worked out. Commented May 18, 2012 at 14:44

4 Answers

1
[^a-z0-9][\w]+(?=\s)

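I changed your * to a +, so each match needs at least one word character after the leading symbol. That stops a bare { from matching on its own.

It works fine in RegExr, an awesome regex development tool: http://gskinner.com/RegExr/ (see the bottom right of the window to download the desktop version).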
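
To illustrate the difference, here is a small PHP sketch that runs both patterns against a single rule (a shortened sample, not the full stylesheet from the question). It also shows the limitation with conjoined selectors: the lookahead (?=\s) wants whitespace after the name, and in #blah.newclass the ID is followed by a dot.

```php
<?php
// Shortened one-rule sample, assumed for illustration.
$css = "#blah.newclass {\n color:#fff;\n}";

// Original pattern: \w* may match nothing, so "{" followed by a newline
// is a match all by itself.
preg_match_all('/[^a-z0-9][\w]*(?=\s)/', $css, $star);

// Amended pattern: \w+ needs at least one word character.
preg_match_all('/[^a-z0-9][\w]+(?=\s)/', $css, $plus);

var_dump(in_array('{', $star[0], true)); // bool(true)
var_dump($plus[0] === ['.newclass']);    // bool(true): "#blah" is skipped
```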


4 Comments

Perfect! Thanks a ton! Oops, spoke too soon... it still won't match the IDs.
Sorry, to be more specific: it does match the IDs when they're "standalone", but when an ID is conjoined with a class, as in #blah.newclass, the ID is not included in the match.
Download RegExr and you can play with the Regex until it fits ;) Hover over each part of your regex in the top bar and RegExr will give you an explanation of what it's matching. Failing that I can have a look when I get a chance, but that may not be until tomorrow.
I wound up using 2 separate regexes: one to strip out {stuff} and another to select all the remaining matches. I updated the question with the final solution and accepted your answer, as you did point me in the right direction. Thanks!
1

This version is based on nealio82's, but also allows pseudo-selectors and hyphens: [^a-z0-9][\w:-]+(?=\s)

Comments

0
/(?<!:\s)[#.][\w]*/

Something like this? It excludes the #FFFFFF color values...

1 Comment

Although that one does work with the sample given above, if you add things like background: url('../img.jpg'); then ".jpg" is matched as well.
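
The false positive is easy to reproduce. A minimal PHP sketch, with a one-rule sample made up for the test:

```php
<?php
// One-rule sample assumed for illustration; the url() value contains dots.
$css = ".newclass {\n background: url('../img.jpg');\n}";

// The lookbehind only rejects a "#" or "." that directly follows ": ",
// so the dots inside the url() value still match.
preg_match_all('/(?<!:\s)[#.][\w]*/', $css, $m);

var_dump(in_array('.jpg', $m[0], true)); // bool(true)
print_r($m[0]);
```

Besides .newclass and .jpg, the two dots in "../" each show up as a one-character match, because \w* is allowed to match nothing.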
0

The solution posted by OP works, though it didn't work for me with CSS classes that had hyphens. As such, I've amended the second pattern to work more effectively:

$pattern_two = '/[\.|#]([A-Za-z0-9_\-])*(\s?)+/';
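
A quick way to try the amended pattern, sketched in PHP on an already-stripped input (the sample string is made up for illustration). Note that the trailing (\s?)+ also consumes whitespace after the name, so trimming each match is probably a good idea:

```php
<?php
// Made-up input that has already had its declaration blocks emptied.
$stripped = ".old-class #blah{}";

$pattern_two = '/[\.|#]([A-Za-z0-9_\-])*(\s?)+/';
preg_match_all($pattern_two, $stripped, $selectors);

// The first match is ".old-class " with a trailing space, so trim it.
$names = array_map('trim', $selectors[0]);

print_r($names);
```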

Comments
