Q: Extracting the non-comment part of code with a Python regex

Stack Overflow user

Asked 2018-08-01 04:56:41

Answers: 1, Views: 71, Followers: 0, Votes: 0

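I am trying to extract the "non-comment" part of C code using Python. So far, my code extracts "non_comment;" from each of the examples below, and returns "" when there is nothing to extract.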

// comment
/// comment
non_comment;
non_comment; /* comment */
non_comment; // comment
/* comment */ non_comment;
/* comment */ non_comment; /* comment */
/* comment */ non_comment; // comment

Here is the source code. I use doctest to unit test the different scenarios.

import re
import doctest

def remove_comment(expr):
  """
  >>> remove_comment('// comment')
  ''
  >>> remove_comment('/// comment')
  ''
  >>> remove_comment('non_comment;')
  'non_comment;'
  >>> remove_comment('non_comment; /* comment */')
  'non_comment;'
  >>> remove_comment('non_comment; // comment')
  'non_comment;'
  >>> remove_comment('/* comment */ non_comment;')
  'non_comment;'
  >>> remove_comment('/* comment */ non_comment; /* comment */')
  'non_comment;'
  >>> remove_comment('/* comment */ non_comment; // comment')
  'non_comment;'
  """
  expr = expr.strip()
  if expr.startswith(('//', '///')):
    return ''
  # throw away /* ... */ comment, and // comment at the end
  pattern = r'(/\*.*\*/\W*)?(\w+;)(//|/\*.*\*/\W*)?'
  r = re.search(pattern, expr)
  return r.group(2).strip() if r else ''

doctest.testmod()

However, I am not happy with this code, and I believe there should be a better way to handle this. Does anyone know a better approach? Thanks!

python

regex

1 Answer

Stack Overflow user

Answered 2018-08-01 05:18:25

Instead of extracting all the non-comment text, try removing the comments by replacing them with "".

Demo

\/\/.*|\/\*[^*]*\*\/ is one such pattern. It matches anything enclosed in /*...*/ or anything starting with //.
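The substitution idea above can be sketched as follows. This is a minimal sketch, not the answerer's exact code: the name COMMENT_RE is hypothetical, and the block-comment branch uses a non-greedy `.*?` in place of the answer's `[^*]*`, an assumption made here so that a comment containing a `*` in its body is also removed. It also assumes the input has no string literals containing `//` or `/*`, since a plain regex cannot tell those apart from real comments.

```python
import re

# Hypothetical name. Matches either a // line comment (up to the end of the
# line) or a /* ... */ block comment. re.DOTALL lets the block-comment branch
# span several lines; the line-comment branch uses [^\n]* so it still stops
# at the end of its own line.
COMMENT_RE = re.compile(r'//[^\n]*|/\*.*?\*/', re.DOTALL)


def remove_comment(expr):
    """Return expr with C-style comments replaced by '' and outer whitespace trimmed."""
    return COMMENT_RE.sub('', expr).strip()


if __name__ == '__main__':
    for line in ['// comment',
                 'non_comment; /* comment */',
                 '/* comment */ non_comment; // comment']:
        print(repr(remove_comment(line)))
```

Because the comments are deleted rather than the code being captured, this version does not depend on the non-comment part looking like `\w+;`, so a statement such as `a = b + 1;` is left intact.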

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/51621936
