Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Combining multiple regexp_like statements & correct regex syntax

4169408Feb 2 2020 — edited Feb 3 2020

I’m working on creating various 0/1 product flags category variables based on multiple substrings within a string field. The string values contained in the category_description field don’t stick to a consistent classification pattern. Notice for each flag variable I’m using two regexp_like statements. The two things I’m trying to accomplish are.

  1. For each category flag variable combine the two regexp_like statements into only one regexp_like statement
  2. I’m still trying to correct some of the issues I’m running to with the regex syntax.
    • Swimsuit classifying categories which actually dress suits and not swimsuits or flagging jumpsuits as swimsuits
    • A category is being flagged as swimwear when it represents just shorts. Example: Men > Clothing > Shorts & Swimwear > Shorts
    • Situations where there are no spaces between substrings. Example: where the description Layette/Infant Unisex - Accessories – Sale is not being flagged as baby when it contains the word infant

Below is my current code, sample data, and my desired output. Any assistance will be greatly appreciated.  Thanks! 

Sample data:

CREATE TABLE category_flags
(
category_description VARCHAR
(80)
);
INSERT INTO category_flags
  
(category_description)
VALUES

("Baby > Basics > Accessories > Hats,Scarves, & Gloves")
("Men > Shoes & Accessories > Hats")
("Men > Shoes & Accessories > Ties")
("Women > Clothing > Jackets")
("Women > Shoes & Accessories > Hats")
("Kids > Boys > Jackets")
("Tall/Men/Outerwear/Cloth/Sale")
("Men > Bottoms > Swim")
("Men > Bottoms > Swimwear")
("US/Big_Tall/Men/Chinos/Flat_Front")
("men>clothing>sweatshirts")
("men>clothing>swimwear")
("Children/Girls/Outfit")
("Men > Accessories > Fragrance")
("Black_Top/Men/Dress_Shirts/Slim_Fit/Sale")
("Men > Bottoms > Suits, Sport Coats & Trousers")
("Women > Tops > Sweaters")
("Kids > Accessories > Boy's Accessories > Bags")
("boys>accessories>socks")
("Blue_Look/Women/Pants/Sale")
("women > Clothing/Knits/> Jackets")
("Women > Clothing > Jackets")
("Women > Shoes & Accessories > Hats")
("Women > Clothing > Jackets")
("Dresswear/Women > Belts")
("clothing > men >shorts & swim trunks > swimtrunks")
("Men > Clothing > Shorts & Swimwear > Shorts")
("men>clothing>swimsuits")
("boys>accessories>socks")
("Layette/Infant Unisex - Accessories")
("Layette/Infant Unisex - Accessories - Sale")
("Basics > Baby Boy & Girl> Accessories > Hats,Scarves, & Gloves")
("Women/Clothing & Shoes/Swimwear")
("Evening/Men > Shoes & Accessories > Ties")
("Summer/Child/Jumpers")
("women/fall/clothing & accessories/jumpsuits")
("Women/Clothing & Swimwear/Skirts")
);

Current script:

select

category_description,
case when regexp_like(category_description, '(.[^o]|[^w]o)men|^men','i')  
and not regexp_like(category_description, 'accessories|hats|bags|belts|shoes|socks|ties' ,'i')   then 1 else 0 end as Mens_Apparel,

case when regexp_like(category_description, '(.[^o]|[^w]o)men|^men','i')  
and  regexp_like(category_description, 'accessories|hats|bags|belts|shoes|socks|ties' ,'i')   then 1 else 0 end as Mens_Accessories,

case when regexp_like (category_description, 'women','i')  
and not regexp_like(category_description, 'accessories|hats|bags|belts|shoes|socks|ties' ,'i')   then 1 else 0 end as Womens_Apprarel,

case when regexp_like (category_description, 'women','i')  
and  regexp_like(category_description, 'accessories|hats|bags|belts|shoes|socks|ties' ,'i')   then 1 else 0 end as Womens_Accessories ,

case when regexp_like (category_description, 'boys?|girls?|junior|child(ren)?|kids?','i')
and not  regexp_like (category_description,'infant|babys?|hidden|womens?|mens?|gift cards?|test','i') then 1 else 0 end as Kids_Products  ,

case when regexp_like(category_description,'toddler|infant|babys?|','i')
and not  regexp_like (category_description,'hidden|womens|mens|^gift cards','i') then 1 else 0 end as Baby_Products,

case when  regexp_like(category_description, '(.[^o]|[^w]o)men|^men','i') 
  
and regexp_like(category_description,'swim(wear)?|swim(suit)?','i')then 1 else 0 end as  Mens_Swimwear  ,

case when  regexp_like(category_description,'womens','i')
and regexp_like(category_description,'(swim(wear|suits))','i') then 1 else 0 end as  Womens_Swimwear 

from category_flags

         

category_descriptionMens_ApparelWomens_ApprarelMens_AccessoriesWomens_AccessoriesMens_SwimWomens_SwimKids_ProductsBaby_Products
Baby > Basics > Accessories > Hats,Scarves, & Gloves00000001
Men > Shoes & Accessories > Hats00100000
Men > Shoes & Accessories > Ties00100000
Women > Clothing > Jackets01000000
Women > Shoes & Accessories > Hats00010000
Kids > Boys > Jackets00000010
Tall/Men/Outerwear/Cloth/Sale10000000
Men > Bottoms > Swim10001000
Men > Bottoms > Swimwear10001000
US/Big_Tall/Men/Chinos/Flat_Front10000000
men>clothing>sweatshirts10000000
men>clothing>swimwear10001000
Children/Girls/Outfit00000010
Men > Accessories > Fragrance00100000
Black_Top/Men/Dress_Shirts/Slim_Fit/Sale10000

Comments

Post Details

Added on Feb 2 2020
4 comments
3,230 views